Abstract

The subject of this paper is the problem of estimating the service time distribution of the M/G/∞ queue from incomplete data on the queue. The goal is to estimate $G$ from observations of the queue-length process at the points of a regular grid on a fixed time interval.

We propose an estimator and analyze its accuracy over a family of target service time distributions. The original M/G/∞ problem is closely related to the problem of estimating derivatives of the covariance function of a stationary Gaussian process. We consider the latter problem and derive lower bounds on the minimax risk. The obtained results strongly suggest that the proposed estimator of the service time distribution is rate optimal.

1 Introduction


Suppose that customers arrive at a system at time instances $\{\tau_j, j \in \mathbb{Z}\}$, obtain service upon arrival, and leave the system at time instances $\{y_j, j \in \mathbb{Z}\}$ after the service is completed. The $j$th customer arriving at $\tau_j$ requires service time $\sigma_j$, so that its departure epoch is $y_j = \tau_j + \sigma_j$. If $\{\tau_j, j \in \mathbb{Z}\}$ is a realization of a stationary Poisson process on $\mathbb{R}$, and $\{\sigma_j, j \in \mathbb{Z}\}$ are non-negative independent random variables with common distribution $G$, independent of $\{\tau_j, j \in \mathbb{Z}\}$, then the above description corresponds to the M/G/∞ queueing system. In this paper we are interested in estimating the service time distribution $G$ from incomplete data on the queue.

The M/G/∞ system is perhaps one of the most widely studied models in queueing theory; its probabilistic properties are fairly well understood. However, statistical inference in such models has attracted little attention.

The problem of estimating the service time distribution $G$ in the M/G/∞ queue has been studied under different assumptions on the available data. The following three observation schemes have been considered in the literature:

(i) observation of arrival $\{\tau_j, j \in \mathbb{Z}\}$ and departure $\{y_j, j \in \mathbb{Z}\}$ epochs without their matchings;

(ii) observation of the queue-length (number-of-busy-servers) process $\{X(t)\}$;

(iii) observation of the busy-period process $\{\mathbf{1}(X(t) > 0)\}$.

*The author is grateful to Gideon Weiss for attracting his attention to the problem studied in this paper, and to Oleg Lepski for useful discussions and suggestions. Part of this work has been done while the author was visiting NYU Shanghai.


We note that observation schemes (i) and (ii) are equivalent up to initial conditions on the 
queue length. In particular, arrival and departure epochs are uniquely determined by the queue- 
length process, while the queue length can be reconstructed from the input-output data provided 
that the initial state of the queue is known. 


In setting (i) Brown (1970) proposed an estimator of G which is based on the idea of pairing 
every departure epoch with the closest arrival epoch to the left. Differences between these 
epochs constitute an ergodic stationary random sequence whose marginal distribution is related 
to the service time distribution G by a simple formula. Then estimation of G can be achieved 
by inverting the formula and substituting the empirical marginal distribution of the differences. 
Brown (1970) proved that the proposed estimator is consistent. Recently Blanghaps et al. (2013) 
extended the work of Brown; they showed that pairing a departure epoch with the $i$-closest 
arrival epoch to the left can be worthwhile. 


Nonparametric estimation of the service time distribution $G$ under observation schemes (ii) and (iii) was considered in Bingham & Pitts (1999). It is well known that in the steady state the queue-length process $\{X(t)\}$ is stationary with Poisson marginal distribution and correlation function

$$
H(t) = 1 - G^*(t), \qquad G^*(t) := \frac{\int_0^t [1 - G(x)]\,dx}{\int_0^\infty [1 - G(x)]\,dx}; \tag{1}
$$


see, e.g., Benes (1957) and Reynolds (1975). This fact suggests that the function $G^*$ can be reconstructed by estimating the correlation function of the queue-length process. The work of Bingham & Pitts (1999) discusses this approach and provides standard results from the time series literature for estimators of $G^*$. The idea of reconstructing the service time distribution from the correlation structure of the queue-length process was also exploited by Pickands & Stine (1997). The model considered in that paper assumes that a Poisson number of customers arrives at discrete times $1, 2, \ldots, T$, and service times are i.i.d. random variables taking values in the set of non-negative integers. In this discrete setting estimation of the service time distribution is equivalent to estimating a linear form of the correlation function of the queue-length process. For the latter problem standard results from the time series literature are applicable. Other related work is reported in Brillinger (1974), Bingham & Dunham (1997), Hall & Park (2004), Moulines et al. (2007), Grübel & Wegener (2011), Schweer & Wichelhaus (2014); see Blanghaps et al. (2013) for additional references.


Although estimation of $G$ under different observation schemes has been considered in the literature, the most interesting and important statistical questions remain open. In particular, it is not clear what estimation accuracy is achievable in such problems, and how to construct optimal estimators. The goal of this paper is to shed light on some of these issues.

In this work we adopt the minimax approach for measuring estimation accuracy. It is assumed that the estimated distribution $G$ belongs to a given functional class, and the accuracy of any estimator is measured by its worst-case mean squared error on the class. The functional class is defined in terms of restrictions on smoothness and tail behavior of $G$ (for precise definitions see Section 4). We concentrate on observation scheme (ii), when the queue-length process is observed on a fixed interval at the points of a regular grid. We want to estimate $G$ at a fixed point using such observations. From now on we will refer to this setting as the M/G/∞ estimation problem.

We develop an estimator of $G$ which is based on the relationship between the distribution $G$ and the covariance function of the queue-length process, as discussed in Bingham & Pitts (1999) and Pickands & Stine (1997) [cf. (1)]. In particular, estimating $G$ at a fixed point is reduced to estimating the derivative of the covariance function of the queue-length process at this point. We analyze the accuracy of our estimator over a suitable class of target distributions and derive an upper bound on the maximal risk. The upper bound is expressed in terms of the functional class parameters and the observation horizon. The problem of estimating the arrival rate is discussed as well.

A natural question is: what is the achievable estimation accuracy in the M/G/∞ problem? This question calls for a lower bound on the minimax risk. Since explicit formulas for finite dimensional distributions of the queue-length process in the M/G/∞ model are not available, derivation of lower bounds on the minimax risk seems to be analytically intractable. Therefore, driven by a Gaussian approximation to the queue-length process, we consider a closely related estimation problem for a Gaussian model. Specifically, let $\{X(t), t \in \mathbb{R}\}$ be a continuous-time stationary Gaussian process which is observed at the points of a regular grid on a given time interval. Using such discrete observations we want to estimate the derivative of the covariance function of $\{X(t), t \in \mathbb{R}\}$. We derive a lower bound on the minimax risk in this problem, and show that under suitable conditions it converges to zero at the same rate as the risk of our estimator in the M/G/∞ estimation problem. This fact strongly suggests that our estimator of the service time distribution is rate-optimal.

The problem of estimating derivatives of covariance functions at a fixed point (or, more generally, linear functionals of covariance functions/spectral densities) from discrete observations is interesting in its own right. Although various settings were considered in the literature, we are not aware of any work dealing with estimation of covariance function derivatives. For discrete-time stationary processes asymptotically efficient estimators of smooth functionals of the spectral density were proposed in Hasminskii & Ibragimov (1986); see also Ginovyan (2011), where continuous-time stationary processes and continuous observations were considered. Nonparametric estimation of covariance functions for continuous-time stationary processes from discrete observations is discussed in Hall et al. (1994) and Hall & Patil (1994). For other related work we refer to Masry (1983), Haberzettl (1997), Srivastava & Sengupta (2010) and references therein.

The rest of this paper is structured as follows. Section 2 contains a formal statement of the M/G/∞ estimation problem. Section 3 presents some results on properties of the queue-length process; these results are instrumental for subsequent developments in the paper. In Section 4 we consider the M/G/∞ estimation problem, define our estimator and establish upper bounds on its maximal risk. Section 5 deals with the problem of estimating the arrival rate in the M/G/∞ queue. In Section 6 we relate the M/G/∞ problem to the problem of estimating the derivative of the covariance function of a continuous-time stationary Gaussian process, and derive a lower bound on the minimax risk for the latter problem. Proofs are given in Section 7.


2 Problem formulation 

Let $\{\tau_j, j \in \mathbb{Z}\}$ be arrival epochs constituting a realization of a stationary Poisson point process of intensity $\lambda$ on the real line. The service times $\{\sigma_j, j \in \mathbb{Z}\}$ are positive independent random variables with common distribution $G$, independent of $\{\tau_j, j \in \mathbb{Z}\}$. Assume that the system is in the steady state; then the queue-length process $\{X(t), t \in \mathbb{R}\}$ is given by

$$
X(t) = \sum_{j \in \mathbb{Z}} \mathbf{1}(\tau_j \le t,\ \sigma_j > t - \tau_j), \qquad t \in \mathbb{R}. \tag{2}
$$

Suppose that $X(t)$ is observed on the time interval $[0,T]$ at the points of the regular grid $t_i = i\delta$, $i = 1, \ldots, n$, where $\delta > 0$ is the sampling interval, and $T = n\delta$. Denote $X^n = (X(t_1), \ldots, X(t_n)) \in \mathbb{R}^n$. Our goal is to estimate the distribution function $G$ at a single given point $x_0 \in \mathbb{R}_+$ using the observation $X^n$. In Section 5 we also discuss the problem of estimating the arrival rate $\lambda$ from the observation $X^n$.
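To make the observation scheme concrete, the following is a minimal simulation sketch of the sampled queue-length vector $X^n$ built directly from (2). It is an illustration only: the function name `simulate_queue_length`, the `warmup` parameter and the exponential service law used in the usage lines are assumptions made here, and starting the arrival stream at time `-warmup` only approximates the steady state.

```python
import random

def simulate_queue_length(lam, sample_service, delta, n, warmup, rng):
    """Return [X(delta), ..., X(n*delta)] for an M/G/infinity queue.

    Arrivals form a Poisson process of rate lam on (-warmup, n*delta];
    the warm-up period is a hypothetical device to approximate steady state.
    """
    horizon = n * delta
    arrivals = []
    t = -warmup
    while True:
        t += rng.expovariate(lam)      # exponential inter-arrival times
        if t > horizon:
            break
        arrivals.append(t)
    # departure epoch y_j = tau_j + sigma_j
    departures = [tau + sample_service(rng) for tau in arrivals]
    counts = []
    for i in range(1, n + 1):
        t_i = i * delta
        # X(t_i) counts customers with tau_j <= t_i and sigma_j > t_i - tau_j
        counts.append(sum(1 for tau, y in zip(arrivals, departures)
                          if tau <= t_i < y))
    return counts

# Usage: exponential service with rate mu = 1, arrival rate lam = 5 (rho = 5).
rng = random.Random(1)
x = simulate_queue_length(5.0, lambda r: r.expovariate(1.0), 0.5, 400, 30.0, rng)
sample_mean = sum(x) / len(x)
```

The sample mean of the simulated vector should be close to the traffic intensity (here 5), in line with the Poisson marginal distribution of the queue length.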

The distribution of the observation $X^n$ is fully characterized by the service time distribution $G$ and by the arrival rate $\lambda$. From now on $P_{G,\lambda}$ stands for the probability measure generated by $\{\tau_j, j \in \mathbb{Z}\}$ and $\{\sigma_j, j \in \mathbb{Z}\}$ when the $\sigma_j$'s are distributed $G$, and the arrival rate is $\lambda$. Correspondingly, $E_{G,\lambda}$ is the expectation with respect to $P_{G,\lambda}$. In the problem of estimating $G$ when the arrival rate $\lambda$ is known, we use the notation $P_G$ and $E_G$ for the probability measure and expectation respectively.


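By an estimator $\hat G(x_0) = \hat G(X^n; x_0)$ of $G(x_0)$ we mean any measurable function of the observation $X^n$. We adopt the minimax approach for measuring estimation accuracy. Let $\mathcal{G}$ be a class of distribution functions; then the accuracy of $\hat G(x_0)$ is measured by the maximal mean squared risk over the class:

$$
\mathcal{R}_{x_0}[\hat G; \mathcal{G}] := \sup_{G \in \mathcal{G}} \Big[ E_G \big|\hat G(x_0) - G(x_0)\big|^2 \Big]^{1/2}.
$$

The minimax risk is defined by

$$
\mathcal{R}^*_{x_0}[\mathcal{G}] = \inf_{\hat G} \mathcal{R}_{x_0}[\hat G; \mathcal{G}],
$$

where inf is taken over all possible estimators. We want to develop a rate-optimal (optimal in order) estimator $\hat G(x_0)$ such that

$$
\mathcal{R}_{x_0}[\hat G; \mathcal{G}] \le C\, \mathcal{R}^*_{x_0}[\mathcal{G}],
$$

where $C$ is a constant independent of the observation horizon $T$ and the sampling interval $\delta$.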

In the problem of estimating the arrival rate $\lambda$ from the observation $X^n$ the estimation accuracy is measured similarly. If $\hat\lambda = \hat\lambda(X^n)$ is an estimator of $\lambda$ (a measurable function of $X^n$) then the maximal risk of $\hat\lambda$ is defined by

$$
\mathcal{R}[\hat\lambda; \mathcal{G}] = \sup_{G \in \mathcal{G}} \Big[ E_{G,\lambda} |\hat\lambda - \lambda|^2 \Big]^{1/2}.
$$

We will consider functional classes $\mathcal{G}$ which impose restrictions on smoothness and tail behavior of the distribution functions. The corresponding definitions are given in Section 4.


3 Queue-length process


Let

$$
\frac{1}{\mu} := E_G[\sigma] = \int_0^\infty [1 - G(t)]\,dt < \infty
$$

with $\mu$ being the service rate, and let $\rho := \lambda/\mu$ be the traffic intensity. Define

$$
H(t) := \mu \int_t^\infty [1 - G(x)]\,dx = 1 - \mu \int_0^t [1 - G(x)]\,dx, \qquad t \in \mathbb{R}_+. \tag{3}
$$

The function $G^* := 1 - H$ is often called the stationary-excess cumulative distribution function [see, e.g., (Whitt 1985)]. If $G$ is a distribution function of an interval between points in a renewal process, then $G^*$ represents a distribution function of the interval between an arbitrary time and the next renewal point. In our context, the important role of $H$ stems from the fact that it is the correlation function of the queue-length process $\{X(t), t \in \mathbb{R}\}$; see Proposition 1 below.

Observe that $H(0) = 1$, and $H$ is monotone decreasing on the positive real line. Although the function $H$ is defined on $\mathbb{R}_+$ only, it will be convenient to extend its definition to the whole real line $\mathbb{R}$ by setting $H(t) = H(-t)$ for $t < 0$. From now on we use the suffix notation $X_i = X(t_i) = X(i\delta)$, $H_i = H(t_i) = H(i\delta)$, etc.
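As a small numerical illustration of definition (3), the sketch below evaluates $H(t) = 1 - \mu\int_0^{|t|}[1-G(x)]\,dx$ by the midpoint rule. The helper name `correlation_H` and the two service laws in the usage lines (exponential with rate 2, and uniform on $[0,1]$) are assumptions chosen only because $H$ is then available in closed form: $e^{-2t}$ and $(1-t)^2$ on $[0,1]$, respectively.

```python
import math

def correlation_H(survival, mu, t, steps=2000):
    """H(t) = 1 - mu * integral_0^|t| survival(x) dx, via the midpoint rule.

    survival(x) is 1 - G(x); mu is the service rate 1/E[sigma].
    """
    t = abs(t)                      # H is extended to t < 0 by symmetry
    width = t / steps
    integral = sum(survival((i + 0.5) * width) for i in range(steps)) * width
    return 1.0 - mu * integral

# Exponential service with rate 2: H(t) = exp(-2 t).
h_exp = correlation_H(lambda x: math.exp(-2.0 * x), 2.0, 0.7)

# Uniform service on [0, 1] (mean 1/2, so mu = 2): H(t) = (1 - t)^2 for t in [0, 1].
h_unif = correlation_H(lambda x: max(0.0, 1.0 - x), 2.0, 0.5)
```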


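Proposition 1 The following statements hold.

(i) For any $t \in \mathbb{R}$ the distribution of $X(t)$ is Poisson with parameter $\rho$.

(ii) For any $t, s \in \mathbb{R}$

$$
E_G[X(t)X(s)] = \rho^2 + \rho H(t-s).
$$

(iii) For any $\theta = (\theta_1, \ldots, \theta_n)$, $n \ge 1$, one has

$$
\log E_G\Big[\exp\Big\{\sum_{i=1}^n \theta_i X_i\Big\}\Big] = \rho S_n(\theta), \tag{4}
$$

$$
S_n(\theta) := \sum_{m=1}^{n} (e^{\theta_m} - 1) + \sum_{k=1}^{n-1}\sum_{m=k}^{n-1} H_{m-k+1}\,(e^{\theta_k} - 1)\, e^{\sum_{i=k+1}^{m}\theta_i}\,(e^{\theta_{m+1}} - 1). \tag{5}
$$

In particular, if $\theta^* = (\vartheta, \ldots, \vartheta)$ for some $\vartheta \in \mathbb{R}$ then

$$
S_n(\theta^*) = n(e^{\vartheta} - 1) + n(e^{\vartheta} - 1)^2 \sum_{k=1}^{n-1}\Big(1 - \frac{k}{n}\Big) e^{(k-1)\vartheta} H_k.
$$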
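The structure of the cumulant formula can be checked numerically. The sketch below is a hedged illustration that reads (5) as a sum of single-variable terms plus one term per pair of grid points, with $(e^{\theta}-1)$ factors for the pair and $e^{\theta}$ factors for the variables in between. The function names `S_n` and `S_n_constant` and the geometric correlation sequence in the usage lines are assumptions made here. The code compares the general formula with the equal-$\theta$ special case and, for $n = 2$, with the classical bivariate Poisson representation.

```python
import math

def S_n(theta, H):
    """Evaluate S_n(theta) from (5); H[l] is the correlation at lag l (H[0] = 1)."""
    n = len(theta)
    total = sum(math.exp(th) - 1.0 for th in theta)
    for a in range(n):                      # left end of the pair
        for b in range(a + 1, n):           # right end of the pair
            between = sum(theta[a + 1:b])   # variables sandwiched by the pair
            total += (H[b - a] * (math.exp(theta[a]) - 1.0)
                      * math.exp(between) * (math.exp(theta[b]) - 1.0))
    return total

def S_n_constant(v, n, H):
    """Special case theta = (v, ..., v)."""
    e = math.exp(v) - 1.0
    tail = sum((1.0 - k / n) * math.exp((k - 1) * v) * H[k] for k in range(1, n))
    return n * e + n * e * e * tail

H = [math.exp(-0.3 * l) for l in range(6)]
general = S_n([0.2] * 5, H)
special = S_n_constant(0.2, 5, H)

# n = 2: X_1 = A + C, X_2 = B + C with independent Poisson parts.
a, b, h = 0.4, -0.7, H[1]
bivariate = ((1 - h) * (math.exp(a) - 1) + (1 - h) * (math.exp(b) - 1)
             + h * (math.exp(a + b) - 1))
```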


Remark 1

(i) The statements (i) and (ii) are well known; in fact, they are immediate consequences of (iii). The first statement can be found in many textbooks [see, e.g., Parzen (1962, p. 147) and Ross (1970, p. 19)], while the second one appears, e.g., in Benes (1957) and Reynolds (1975). As for part (iii), Lindley (1956) considered the special case of $n = 3$ and discussed heuristically a derivation for general $n$. However, we could not find formula (4)-(5) in the literature, and, to the best of our knowledge, it is new. This formula plays an important role in subsequent derivations.

(ii) The joint distribution of $X^n$ is the so-called multivariate Poisson; for details see, e.g., Lindley (1956, §2) and Milne (1970). The statements (i) and (ii) show that $H$ is the correlation function of the process $\{X(t), t \in \mathbb{R}\}$.

It is instructive to write out the form of (4)-(5) in the special case $n = 4$. Let $1 \le i < j < k < m \le n$; then

$$
\begin{aligned}
\frac{1}{\rho}\log E_G\big[\exp\{\theta_1 X_i + \theta_2 X_j + \theta_3 X_k + \theta_4 X_m\}\big] ={}& \sum_{l=1}^{4}(e^{\theta_l} - 1) \\
&+ H_{j-i}(e^{\theta_1} - 1)(e^{\theta_2} - 1) + H_{k-i}(e^{\theta_1} - 1)e^{\theta_2}(e^{\theta_3} - 1) \\
&+ H_{m-i}(e^{\theta_1} - 1)e^{\theta_2+\theta_3}(e^{\theta_4} - 1) + H_{k-j}(e^{\theta_2} - 1)(e^{\theta_3} - 1) \\
&+ H_{m-j}(e^{\theta_2} - 1)e^{\theta_3}(e^{\theta_4} - 1) + H_{m-k}(e^{\theta_3} - 1)(e^{\theta_4} - 1).
\end{aligned} \tag{6}
$$

As is seen from the above formula, the first term on the right hand side of (6) coincides with the cumulant generating function of independent Poisson random variables. The other terms are associated with all possible pairs of random variables. For every pair of random variables the corresponding term contains the correlation between the variables, and factors $(1 - e^{\theta})$ and $e^{\theta}$, where $(1 - e^{\theta})$-factors correspond to the pair, and $e^{\theta}$-factors correspond to the random variables "sandwiched" by the pair.

The formula (6) allows us to compute mixed moments of the fourth order as presented in the next statement.

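Proposition 2 Let $1 \le i < j < k < m \le n$; then

$$
\begin{aligned}
E_G[X_iX_jX_kX_m] ={}& \rho^4 + \rho^3\big(H_{j-i} + H_{k-i} + H_{m-i} + H_{k-j} + H_{m-j} + H_{m-k}\big) \\
&+ \rho^2\big(H_{k-i} + H_{m-j} + 2H_{m-i} + H_{j-i}H_{m-k} + H_{k-i}H_{m-j} + H_{k-j}H_{m-i}\big) + \rho H_{m-i}.
\end{aligned}
$$

More generally, for any $i, j, k, m \in \{1, \ldots, n\}$ and any subset of indexes $I \subseteq \{i, j, k, m\}$ define $q_I := \max_{i_1, i_2 \in I} |i_1 - i_2|$. Then

$$
E_G[X_iX_jX_k] = \rho^3 + \rho^2\big(H_{|j-i|} + H_{|k-j|} + H_{|k-i|}\big) + \rho H_{q\{i,j,k\}},
$$

$$
\begin{aligned}
E_G[X_iX_jX_kX_m] ={}& \rho^4 + \rho^3\big(H_{|j-i|} + H_{|k-i|} + H_{|m-i|} + H_{|k-j|} + H_{|m-j|} + H_{|m-k|}\big) \\
&+ \rho^2\big(H_{q\{i,j,k\}} + H_{q\{i,j,m\}} + H_{q\{i,k,m\}} + H_{q\{j,k,m\}} \\
&\qquad\quad + H_{|j-i|}H_{|m-k|} + H_{|k-i|}H_{|m-j|} + H_{|k-j|}H_{|m-i|}\big) + \rho H_{q\{i,j,k,m\}}.
\end{aligned}
$$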
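A quick sanity check of the fourth-moment formula is possible when all four indexes coincide, since then the formula must return the fourth moment of a Poisson($\rho$) variable, $\rho^4 + 6\rho^3 + 7\rho^2 + \rho$. The sketch below assumes the reading of the general formula in which each subset of indexes contributes the correlation at its maximal lag $q_I$; the function name `fourth_moment` and the geometric correlation function in the usage lines are illustrative assumptions.

```python
import math
from itertools import combinations

def fourth_moment(rho, H, i, j, k, m):
    """E[X_i X_j X_k X_m] for arbitrary grid indexes; H(l) is the lag-l correlation."""
    idx = (i, j, k, m)

    def spread(subset):
        # correlation at the largest lag inside the subset (q_I in the text)
        return H(max(subset) - min(subset))

    def lag(a, b):
        return H(abs(a - b))

    pair_sum = sum(spread(p) for p in combinations(idx, 2))
    triple_sum = sum(spread(t) for t in combinations(idx, 3))
    two_pairs = (lag(i, j) * lag(k, m) + lag(i, k) * lag(j, m)
                 + lag(j, k) * lag(i, m))
    return (rho ** 4 + rho ** 3 * pair_sum
            + rho ** 2 * (triple_sum + two_pairs) + rho * spread(idx))

H = lambda l: math.exp(-0.3 * l)
poisson_case = fourth_moment(2.0, H, 3, 3, 3, 3)   # all indexes equal
ordered = fourth_moment(2.0, H, 1, 2, 4, 7)
shuffled = fourth_moment(2.0, H, 7, 4, 1, 2)
```

With $\rho = 2$ the all-equal case gives $16 + 48 + 28 + 2 = 94$, the fourth moment of a Poisson(2) random variable, and the value does not depend on the order in which the indexes are listed.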

As a by-product of statement (iii) in Proposition 1 we can easily obtain the following Gaussian approximation to finite dimensional distributions of the queue-length process $\{X(t), 0 \le t \le T\}$.

Proposition 3 Consider a sequence of M/G/∞ queueing systems, $\{M_l/G/\infty, l = 1, 2, \ldots\}$, with the fixed service time distribution $G$, and with the $l$-th system characterized by the arrival rate $\lambda_l = l\lambda$, $\lambda > 0$. Let $X_l^n = (X_{l,1}, \ldots, X_{l,n}) = (X_l(t_1), \ldots, X_l(t_n))$ be the vector of observations of the queue-length process (2) in the $l$-th system; then

$$
\frac{X_l^n - l\rho e_n}{\sqrt{l\rho}} \xrightarrow{d} \mathcal{N}_n\big(0, \Sigma(H)\big), \qquad l \to \infty,
$$

where $\rho = \lambda/\mu$, $e_n = (1, \ldots, 1) \in \mathbb{R}^n$, and $\Sigma(H) := \{H((i-j)\delta)\}_{i,j=1,\ldots,n}$.

The result of Proposition 3 is well known; it is in line with more general weak convergence results for queues in Borovkov (1967), Iglehart (1973) and Whitt (1974). The proof of Proposition 3 follows immediately from Proposition 1(iii), and it is omitted.


4 Estimation of service time distribution 

According to Proposition 1(ii) the covariance function of the queue-length process is

$$
R(t) := \mathrm{cov}_G\{X(s), X(s+t)\} = \rho H(t).
$$

Since $H'(t) = -\mu[1 - G(t)]$ by (3) and $\rho\mu = \lambda$, differentiation yields

$$
1 - G(t) = -\frac{1}{\lambda} R'(t), \qquad t \in \mathbb{R}_+. \tag{8}
$$

This relationship is the basis for the construction of our estimator of $G(x_0)$.
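The relation between the covariance derivative and the service time distribution, read as $1 - G(t) = -R'(t)/\lambda$, can be illustrated numerically. The sketch below assumes exponential service with rate $\mu$, for which $R(t) = (\lambda/\mu)e^{-\mu|t|}$ and $1 - G(t) = e^{-\mu t}$; the helper name `survival_from_covariance` and the finite-difference step are choices made for this example only.

```python
import math

def survival_from_covariance(R, lam, t, eps=1e-5):
    """Recover 1 - G(t) as -R'(t)/lam, with R' from a central difference."""
    derivative = (R(t + eps) - R(t - eps)) / (2.0 * eps)
    return -derivative / lam

lam, mu = 3.0, 2.0
R = lambda t: (lam / mu) * math.exp(-mu * abs(t))   # rho * H(t), exponential service
recovered = survival_from_covariance(R, lam, 0.8)
exact = math.exp(-mu * 0.8)                          # 1 - G(0.8)
```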


4.1 Estimator construction 


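Let

$$
\hat\rho_k = \frac{1}{n-k}\sum_{i=1}^{n-k} X_i, \qquad k = 0, 1, \ldots, n-1,
$$

and define

$$
\hat R_k = \frac{1}{n-k}\sum_{i=1}^{n-k} (X_i - \hat\rho_k)(X_{i+k} - \hat\rho_k), \qquad k = 0, 1, \ldots, n-1. \tag{9}
$$

Note that $\hat R_k$ is the empirical estimator of the covariance $R_k = R(k\delta) = \rho H(k\delta)$, $k = 0, 1, \ldots, n-1$. For technical reasons we use the estimator $\hat\rho_k$ based on $n - k$ observations and not on $n$.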
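The empirical covariance in (9) is straightforward to compute. The following sketch assumes the reading of (9) in which both the mean estimate and the cross-product sum run over the first $n - k$ observations; the function name `empirical_covariance` and the toy data are illustrative.

```python
def empirical_covariance(x, k):
    """Empirical lag-k covariance of the sampled queue length, as in (9).

    The centring uses the mean of the first n - k observations only.
    """
    m = len(x) - k                     # number of available lag-k pairs
    mean_k = sum(x[:m]) / m
    return sum((x[i] - mean_k) * (x[i + k] - mean_k) for i in range(m)) / m

x = [1, 2, 3, 4]
r0 = empirical_covariance(x, 0)   # mean 2.5, average squared deviation 1.25
r1 = empirical_covariance(x, 1)   # mean of first three is 2, products 0, 0, 2
```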
Let $h > 0$, and for every $x \in [0, T - \delta]$ define the segment

$$
D_x := \begin{cases}
[x - h, x + h], & h \le x \le T - \delta - h, \\
[0, 2h], & 0 \le x < h, \\
[T - \delta - 2h, T - \delta], & T - \delta - h < x \le T - \delta.
\end{cases}
$$

Let $M_{D_x}$ be the set of indexes $k \in \{1, \ldots, n\}$ such that $k\delta \in D_x$, $M_{D_x} := \{k : k\delta \in D_x\}$, and let $N_{D_x}$ be the cardinality of this set, $N_{D_x} := \#\{M_{D_x}\}$.

Fix a positive integer $\ell$, and assume that

$$
h \ge \tfrac{1}{2}(\ell + 2)\delta. \tag{10}
$$

For $x \in [0, T - \delta]$ let $\{a_k(x), k \in M_{D_x}\}$ denote the weights obtained as the solution to the following optimization problem

$$
\min \sum_{k \in M_{D_x}} a_k^2(x) \quad \text{subject to} \quad \sum_{k \in M_{D_x}} a_k(x) = 0, \qquad \sum_{k \in M_{D_x}} a_k(x)(k\delta)^j = j x^{j-1}, \quad j = 1, \ldots, \ell. \tag{$\mathcal{P}_x$}
$$

We use the convention that if $x = 0$ and $j = 1$ then the right hand side of the last constraint in $(\mathcal{P}_x)$ equals 1.

By definition, if (10) holds then the linear filter associated with the weights $\{a_k(x), k \in M_{D_x}\}$ has the following property: it reproduces without error the first derivative of any polynomial $p$ of $\deg(p) \le \ell$ at the point $x$,

$$
\sum_{k \in M_{D_x}} a_k(x)\, p(k\delta) = p'(x), \qquad \forall p : \deg(p) \le \ell. \tag{11}
$$
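The weights can be computed explicitly as a minimum-norm solution of a small linear system. The sketch below is an illustration under two assumptions: the objective of the optimization problem is taken to be the sum of squared weights, and the constraints are written in the equivalent centred basis $(t - x)^j$, which spans the same polynomial space and is numerically better conditioned. The helper names `solve_linear` and `derivative_weights` are hypothetical. The minimum-norm solution of $Aa = b$ is $a = A^\top (AA^\top)^{-1} b$.

```python
def solve_linear(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol

def derivative_weights(points, x, degree):
    """Minimum-norm weights reproducing p'(x) for all polynomials of deg <= degree.

    points are the grid points k*delta lying in the window D_x; degree >= 1.
    """
    # Row j holds the centred monomial (t - x)^j evaluated on the grid.
    A = [[(t - x) ** j for t in points] for j in range(degree + 1)]
    b = [0.0] * (degree + 1)
    b[1] = 1.0                          # derivative of (t - x)^j at x is 1 only for j = 1
    gram = [[sum(u * v for u, v in zip(ri, rj)) for rj in A] for ri in A]
    z = solve_linear(gram, b)
    return [sum(z[j] * A[j][k] for j in range(degree + 1))
            for k in range(len(points))]

# Usage: delta = 0.1, x = 1.0, h = 0.5, degree 2.
points = [0.1 * k for k in range(5, 16)]
a = derivative_weights(points, 1.0, 2)
p = lambda t: 3.0 * t * t - 2.0 * t + 1.0      # p'(1) = 4
reproduced = sum(w * p(t) for w, t in zip(a, points))
```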


Now we are in a position to define our estimator of $G(x_0)$: it is given by the formula

$$
\hat G_h(x_0) = 1 + \frac{1}{\lambda}\sum_{k \in M_{D_{x_0}}} a_k(x_0)\hat R_k, \tag{12}
$$

where $\hat R_k$, $k = 0, \ldots, n-1$, are defined in (9).

The expression under the summation sign on the right hand side of (12) can be viewed as a local polynomial estimator of the derivative $R'(x_0)$ when the empirical covariances $\hat R_k$ are regarded as noisy observations of $R_k = R(k\delta)$. We refer to Goldenshluger & Nemirovski (1997) for a similar construction of the local polynomial estimators of derivatives in the context of the nonparametric regression model.

The estimator $\hat G_h(x_0)$ depends on two design parameters, the window width $h$ and the degree of the polynomial $\ell$; these parameters are specified in the sequel.


4.2 Upper bound on the maximal risk 

Our current goal is to study the accuracy of $\hat G_h(x_0)$. For this purpose, we introduce the functional class of distributions $G$ over which the accuracy of the estimator $\hat G_h(x_0)$ is assessed.

Definition 1

(i) Let $\beta > 0$, $L > 0$ be real numbers, and let $I \subset (0, \infty)$ be a closed interval such that $x_0 \in I$. We define $\mathcal{H}_\beta(L, I)$ to be the class of all distribution functions $G$ on $\mathbb{R}_+$ such that $G$ is $\lfloor\beta\rfloor$ times continuously differentiable on $I$, and

$$
\big|G^{(\lfloor\beta\rfloor)}(x) - G^{(\lfloor\beta\rfloor)}(y)\big| \le L|x - y|^{\beta - \lfloor\beta\rfloor}, \qquad \forall x, y \in I;
$$

here $\lfloor\beta\rfloor$ stands for the maximal integer number strictly less than $\beta$.

(ii) We say that a distribution function $G$ on $\mathbb{R}_+$ belongs to the class $\mathcal{M}_p(K)$, $p \ge 1$, $K > 0$ if

$$
E_G[\sigma^p] = \int_0^\infty p x^{p-1}[1 - G(x)]\,dx \le K < \infty.
$$

(iii) Finally, we put

$$
\bar{\mathcal{H}}_\beta(L, I, K) := \mathcal{H}_\beta(L, I) \cap \mathcal{M}_2(K).
$$

Remark 2

(i) The class $\bar{\mathcal{H}}_\beta(L, I, K)$ imposes restrictions on smoothness in the vicinity of $x_0$. In all that follows the point $x_0$ is assumed to be fixed. If $x_0$ is separated away from zero then we always consider a symmetric interval $I$ centered at $x_0$: $I = [x_0 - d, x_0 + d]$ for some $0 < d < x_0$. In the case $x_0 = 0$ we set $I = [0, 2d]$.

(ii) The definition of $\bar{\mathcal{H}}_\beta(L, I, K)$ requires boundedness of the second moment of the service time distribution. This condition implies that the correlation sequence $\{H(k\delta), k \in \mathbb{Z}\}$ is summable, which corresponds to short-term dependence between the values of the sampled discrete-time queue-length process. This assumption can be relaxed. However, we do not pursue the case of long-term dependence in this paper.


Now we are in a position to state an upper bound on the maximal risk of our estimator. 


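Theorem 1 Let $x_0$ be fixed, $I := [x_0 - d, x_0 + d] \subset (0, (1 - \kappa)T]$ for some $\kappa \in (0, 1)$, and suppose that $G \in \bar{\mathcal{H}}_\beta(L, I, K)$. Let $\hat G_*(x_0)$ be the estimator defined in (12) and associated with the degree $\ell \ge \lfloor\beta\rfloor + 1$ and the window width

$$
h = h_* := \bigg[\frac{K(\sqrt{K} \vee 1)}{L^2 \kappa T}\bigg]^{1/(2\beta+2)}. \tag{13}
$$

If

$$
K(\sqrt{K} \vee 1)L^{-2}\kappa^{-1}d^{-2\beta-2} \le T \le K(\sqrt{K} \vee 1)L^{-2}\kappa^{-1}\big[\tfrac{1}{2}(\ell + 2)\delta\big]^{-2\beta-2} \tag{14}
$$

then one has

$$
\mathcal{R}_{x_0}\big[\hat G_*; \bar{\mathcal{H}}_\beta(L, I, K)\big] \le C L^{1/(\beta+1)}\Big(1 + \frac{1}{\lambda}\Big)^{1/2}\bigg[\frac{K(\sqrt{K} \vee 1)}{\kappa T}\bigg]^{\beta/(2\beta+2)}, \tag{15}
$$

where $C = C(\ell)$ depends on $\ell$ only.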


Remark 3

(i) The upper bound in (14) originates in the requirement that the segment $D_{x_0}$ contains at least $\ell + 1$ grid points. This inequality is fulfilled if sampling is fast enough, $\delta \le O\big((\kappa T)^{-1/(2\beta+2)}\big)$. Thus, if the asymptotics as $T \to \infty$ is considered then $\delta$ should tend to zero so that (14) is fulfilled. The lower bound in (14) ensures that $D_{x_0} \subseteq I$.

(ii) The bound in (15) is non-uniform in $x_0$; it is established for fixed $x_0 \le (1 - \kappa)T$. The bound increases as $\kappa$ gets closer to 0 ($x_0$ approaches $T$). This is not surprising: the empirical covariance estimator is not accurate for large lags. However, if $x_0$ is large in comparison with $T$ then it is advantageous to use the trivial estimator $\tilde G(x_0) = 1$. The risk of $\tilde G(x_0)$ admits the following upper bound:

$$
\mathcal{R}_{x_0}\big[\tilde G; \bar{\mathcal{H}}_\beta(L, I, K)\big] \le K x_0^{-2}, \qquad \forall x_0 \in \mathbb{R}_+. \tag{16}
$$

Indeed, it follows from $G \in \mathcal{M}_2(K)$ that for any $x > 0$

$$
1 - G(x) = \int_x^\infty dG(t) \le x^{-2}\int_x^\infty t^2\,dG(t) \le K x^{-2}.
$$

Thus, $G(x) \ge 1 - Kx^{-2}$, which implies (16). Comparing (15) and (16) we see that for $x_0 \le O\big(T^{\beta/(4\beta+4)}\big)$ it is advantageous to use the estimator $\hat G_*(x_0)$; otherwise $\tilde G(x_0)$ is better. If more stringent conditions on the tail of $G$ are imposed [e.g., $G \in \mathcal{M}_p(K)$ with $p > 2$] then the zone where $\hat G_*(x_0)$ is preferable becomes smaller.


5 Estimation of arrival rate 


The construction of Section 4.1 that led to $\hat G_h(x_0)$ can be used in order to estimate the arrival rate $\lambda$ from discrete observations of the queue-length process.

Let $I = [0, 2d]$ and assume that $G \in \bar{\mathcal{H}}_\beta(L, I, K)$. Under this condition we can use relation (8) in order to construct an estimator of $\lambda$. Indeed, setting $t = 0$ in (8) and taking into account that $G(0) = 0$ we obtain $\lambda = -R'(0)$, where $R'(0)$ is understood here as the right-side derivative of $R$ at zero. Therefore we define the estimator for $\lambda$ by

$$
\hat\lambda = -\sum_{k \in M_{D_0}} a_k(0)\hat R_k, \tag{17}
$$

where $D_0 := [0, 2h]$, $\{a_k(0), k \in M_{D_0}\}$ is the solution to $(\mathcal{P}_0)$ [i.e., $(\mathcal{P}_x)$ with $x = 0$], and $\hat R_k$, $k \in M_{D_0}$, are defined in (9).

The next statement provides an upper bound on the risk of A. 


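Theorem 2 Let $I = [0, 2d]$ and suppose that $G \in \bar{\mathcal{H}}_\beta(L, I, K)$. Let $\hat\lambda_*$ denote the estimator defined in (17) and associated with degree $\ell \ge \lfloor\beta\rfloor + 1$ and window width

$$
h = h_* := \bigg[\frac{K(\sqrt{K} \vee 1)}{L^2 T}\bigg]^{1/(2\beta+2)}. \tag{18}
$$

If

$$
K(\sqrt{K} \vee 1)L^{-2}d^{-2\beta-2} \le T \le K(\sqrt{K} \vee 1)L^{-2}\big[\tfrac{1}{2}(\ell + 2)\delta\big]^{-2\beta-2} \tag{19}
$$

then one has

$$
\sup_{G \in \bar{\mathcal{H}}_\beta(L, I, K)}\Big[E_{G,\lambda}|\hat\lambda_* - \lambda|^2\Big]^{1/2} \le C L^{1/(\beta+1)}(\lambda^2 + \lambda)^{1/2}\bigg[\frac{K(\sqrt{K} \vee 1)}{T}\bigg]^{\beta/(2\beta+2)}, \tag{20}
$$

where $C = C(\ell)$ depends on $\ell$ only.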


Remark 4

(i) The meaning of condition (19) is similar to that of (14), see Remark 3(i).

(ii) If the sampling interval $\delta$ is very small then one can build an estimator which is better than $\hat\lambda_*$. In particular, if the continuous-time observation $\{X(t), 0 \le t \le T\}$ is available then alternative estimators of $\lambda$ can be constructed as follows:

$$
\hat\lambda^+ = \frac{1}{T}\#\{t \in (0, T] : X(t) - X(t-) = 1\}, \qquad \hat\lambda^- = \frac{1}{T}\#\{t \in (0, T] : X(t) - X(t-) = -1\}.
$$

Because arrivals and departures each constitute a Poisson process with intensity $\lambda$, the mean squared errors of $\hat\lambda^+$ and $\hat\lambda^-$ are given by

$$
E_{G,\lambda}|\hat\lambda^+ - \lambda|^2 = E_{G,\lambda}|\hat\lambda^- - \lambda|^2 = \lambda T^{-1}, \qquad \forall\lambda, \forall G.
$$

Thus, in terms of dependence on the observation horizon $T$, the mean squared errors of $\hat\lambda^+$ and $\hat\lambda^-$ tend to zero at the parametric rate $O(1/T)$. This rate is faster than the one in (20).
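The parametric behaviour of the jump-counting estimator can be illustrated by a small Monte Carlo experiment. The sketch below simulates only the arrival-counting estimator (arrivals per unit time on $(0, T]$) and compares its empirical mean squared error with $\lambda/T$; the function names, the seed and the replication count are arbitrary choices made for this illustration.

```python
import random

def arrival_rate_estimate(lam, T, rng):
    """Count arrivals of a Poisson(lam) process in (0, T] and divide by T."""
    count = 0
    t = rng.expovariate(lam)
    while t <= T:
        count += 1
        t += rng.expovariate(lam)
    return count / T

def monte_carlo_mse(lam, T, replications, seed=0):
    """Return (average estimate, empirical mean squared error)."""
    rng = random.Random(seed)
    estimates = [arrival_rate_estimate(lam, T, rng) for _ in range(replications)]
    mean = sum(estimates) / replications
    mse = sum((e - lam) ** 2 for e in estimates) / replications
    return mean, mse

# lam = 2, T = 50: the theoretical mean squared error is lam / T = 0.04.
mean, mse = monte_carlo_mse(2.0, 50.0, 2000)
```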

6 Estimation of covariance function derivative 

Theorem 1 indicates that under a suitable relation between the observation horizon $T$ and the sampling interval $\delta$ the service time distribution $G$ can be estimated with risk of the order $T^{-\beta/(2\beta+2)}$. In particular, for our estimator $\hat G_*(x_0)$

$$
\mathcal{R}_{x_0}\big[\hat G_*; \bar{\mathcal{H}}_\beta(L, I, K)\big] = O\big(T^{-\beta/(2\beta+2)}\big), \qquad T \to \infty,
$$

provided that (14) holds. A natural question is whether this rate of convergence is optimal in the minimax sense. This is the question about lower bounds on the minimax risk $\mathcal{R}^*_{x_0}[\bar{\mathcal{H}}_\beta(L, I, K)]$.

Although statement (iii) of Proposition 1 provides a complete probabilistic characterization of the finite dimensional distributions of the queue-length process $\{X(t), t \in \mathbb{R}\}$, there is no explicit formula available for the distribution of $X^n$. Because all existing techniques for derivation of lower bounds on minimax risks rely upon sensitivity analysis of the family of target distributions, such a derivation in the M/G/∞ problem seems to be intractable. However, some understanding of accuracy limitations in estimating the service time distribution can be gained from consideration of a Gaussian approximating model.

Proposition 3 shows that if the arrival rate $\lambda$ is large, the finite dimensional distributions of $\{X(t), 0 \le t \le T\}$ are close to Gaussian. Thus for large arrival rates we can regard the queue-length process as a stationary Gaussian process. Furthermore, equation (8) shows that the service time distribution $G$ is determined, up to the factor $1/\lambda$, by the derivative of the covariance function of the queue-length process. This characterization suggests that, for large arrival rates, estimating $G$ is as hard as estimating the derivative of the covariance function of a continuous-time stationary Gaussian process from discrete observations. Although there is no formal proof of statistical equivalence of these experiments, the assumption seems plausible. Therefore we study the problem of estimating derivatives of the covariance function of a stationary Gaussian process from discrete observations.

6.1 Problem formulation 

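Let $X(t)$, $t \in \mathbb{R}$, be a stationary Gaussian process with zero mean and covariance function $\gamma \in L_1(\mathbb{R})$. The corresponding spectral density $f$ is given by

$$
f(\omega) = \int_{-\infty}^{\infty}\gamma(t)e^{i\omega t}\,dt = 2\int_0^\infty \gamma(t)\cos(\omega t)\,dt, \qquad \omega \in \mathbb{R},
$$

and, by the inverse Fourier transform,

$$
\gamma(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(\omega)e^{-i\omega t}\,d\omega = \frac{1}{\pi}\int_0^\infty f(\omega)\cos(\omega t)\,d\omega, \qquad t \in \mathbb{R}.
$$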
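As a concrete instance of the covariance and spectral density pair, consider $\gamma(t) = e^{-|t|}$, for which the cosine transform gives $f(\omega) = 2/(1 + \omega^2)$. The sketch below evaluates $f(\omega) = 2\int_0^\infty \gamma(t)\cos(\omega t)\,dt$ numerically; the function name `spectral_density`, the truncation point `t_max` and the step count are assumptions of this example.

```python
import math

def spectral_density(gamma, omega, t_max=40.0, steps=40000):
    """f(omega) = 2 * integral_0^t_max gamma(t) cos(omega t) dt (midpoint rule).

    The integral is truncated at t_max, which is adequate when gamma
    decays quickly.
    """
    width = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * width
        total += gamma(t) * math.cos(omega * t)
    return 2.0 * total * width

gamma = lambda t: math.exp(-abs(t))
f_at_zero = spectral_density(gamma, 0.0)       # exact value 2
f_at_1_5 = spectral_density(gamma, 1.5)        # exact value 2 / (1 + 2.25)
```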
Suppose that we observe the process $\{X(t), t \in \mathbb{R}\}$ on the time interval $[0, T]$ at the points of the regular grid $t_i = i\delta$, $i = 1, \ldots, n$, where $\delta > 0$ is the sampling interval, and $T = n\delta$. Our goal is to estimate the first derivative, $\theta = \theta(\gamma) := \gamma'(x_0)$, of $\gamma$ at a fixed point $x_0 \in (0, \infty)$ using the observation $X^n = \{X(k\delta), k = 1, \ldots, n\}$.

Since the distribution of $X^n$ is completely determined by the covariance function $\gamma$ (or spectral density $f$), we write $P_\gamma$ and $E_\gamma$ for the probability measure and the expectation with respect to the distribution of $X^n$ with covariance $\gamma$.

We measure accuracy in estimating $\theta(\gamma) = \gamma'(x_0)$ by the maximal risk: for any estimator $\hat\theta = \hat\theta(X^n)$ we let

$$
\mathcal{R}_{x_0}[\hat\theta; \mathcal{C}] = \sup_{\gamma \in \mathcal{C}}\Big[E_\gamma|\hat\theta - \gamma'(x_0)|^2\Big]^{1/2},
$$

where $\mathcal{C}$ is a class of target covariance functions. The minimax risk is defined by $\mathcal{R}^*_{x_0}[\mathcal{C}] = \inf_{\hat\theta}\mathcal{R}_{x_0}[\hat\theta; \mathcal{C}]$, where inf is taken over all possible estimators.

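In order to relate the M/G/∞ estimation problem to the present setting let us point out some properties of covariance functions $R(t) = \rho H(t)$ corresponding to the service time distributions $G \in \bar{\mathcal{H}}_\beta(L, I, K)$. First, (3) implies that if $G \in \mathcal{H}_\beta(L, I)$ [see Definition 1(i)] then $R$ is $\lfloor\beta\rfloor + 1$ times continuously differentiable on $I$, and $R' = -\lambda(1 - G)$ inherits the Hölder condition of $G$ with constant $\lambda L$; that is, the regularity index of $R$ on $I$ is $\beta + 1$. Second, the employed moment condition $G \in \mathcal{M}_2(K)$ in the M/G/∞ problem boils down to summability of the covariance sequence $\{R_k\}$. In the context of estimating the derivative of the covariance function this will be assumed directly.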

The above remarks motivate the next definition. 

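Definition 2 Let $x_0$ be fixed, and $I := [x_0 - d, x_0 + d] \subset (0, \infty)$. For $L > 0$, $\beta > 0$ we say that a covariance function $\gamma \in L_1(\mathbb{R})$ belongs to the functional class $\mathcal{C}_\beta(L, I, K)$ if

(i) $\int_{-\infty}^{\infty}|\gamma(t)|\,dt \le K < \infty$;

(ii) $\gamma$ is $\ell := \max\{k \in \mathbb{N} : k < \beta + 1\}$ times continuously differentiable on $I$ and

$$
\big|\gamma^{(\ell)}(x) - \gamma^{(\ell)}(x')\big| \le L|x - x'|^{\beta + 1 - \ell}, \qquad \forall x, x' \in I.
$$

Similarly to the definition of $\bar{\mathcal{H}}_\beta(L, I, K)$ in the M/G/∞ estimation problem, we assume local smoothness around the point $x_0$ only. Note also that the regularity index of $\gamma \in \mathcal{C}_\beta(L, I, K)$ equals $\beta + 1$. We are mainly interested in bounds on the minimax risk $\mathcal{R}^*_{x_0}[\mathcal{C}_\beta(L, I, K)]$.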


6.2 Estimator and bounds on the minimax risk 


An estimator of $\theta = \theta(\gamma) = \gamma'(x_0)$ can be constructed in exactly the same way as the estimator of $G$ in the M/G/∞ problem. Specifically, if $D_{x_0} = [x_0 - h, x_0 + h]$, and if $\{a_k(x_0), k \in M_{D_{x_0}}\}$ is the solution to the optimization problem $(\mathcal{P}_{x_0})$ then we let

$$
\hat\theta_h = \sum_{k \in M_{D_{x_0}}} a_k(x_0)\hat R_k, \tag{21}
$$

where $\hat R_k = \frac{1}{n-k}\sum_{i=1}^{n-k} X_iX_{i+k}$ [cf. (9)]. Note that there is no need here to estimate the mean of $X(t)$ as it is assumed to be zero.

Accuracy properties of $\hat\theta_h$ are very similar to those of $\hat G_h(x_0)$. In particular, using basically the same arguments as in the proof of Theorem 1 we can establish the following result.


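Theorem 3 Let $I = [x_0 - d, x_0 + d] \subset (0, (1 - \kappa)T]$ for some $\kappa \in (0, 1)$, and let $\gamma \in \mathcal{C}_\beta(L, I, K)$. Let $\hat\theta_* = \hat\theta_{h_*}$ be the estimator (21) associated with $\ell \ge \lfloor\beta\rfloor + 1$ and $h = h_* := [K/(L^2\kappa T)]^{1/(2\beta+2)}$. If

$$
K L^{-2}\kappa^{-1}d^{-2\beta-2} \le T \le K L^{-2}\kappa^{-1}\big[\tfrac{1}{2}(\ell + 2)\delta\big]^{-2\beta-2}
$$

then

$$
\mathcal{R}_{x_0}\big[\hat\theta_*; \mathcal{C}_\beta(L, I, K)\big] \le C L^{1/(\beta+1)}\bigg[\frac{K}{\kappa T}\bigg]^{\beta/(2\beta+2)}.
$$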

The proof of the theorem is omitted. 

Thus, the maximal risk of $\hat\theta_*$ converges to zero at the same rate as the risk of $\hat G_*(x_0)$ in the M/G/∞ estimation problem; cf. Theorem 1.

The next theorem shows that this rate of convergence is, in a sense, best possible. 

Theorem 4 Let $I = [x_0 - d, x_0 + d] \subset (0, \infty)$. There exist constants $C_1$ and $C_2$ depending on $\beta$, $x_0$, $d$ and $K$ only such that if

Gi5~‘^ < T, L^T < ( 22 ) 

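then

$$
\liminf_{T \to \infty}\Big\{L^{-1/(\beta+1)}\,T^{\beta/(2\beta+2)}\,\mathcal{R}^*_{x_0}\big[\mathcal{C}_\beta(L, I, K)\big]\Big\} \ge c,
$$

where $c = c(\beta, x_0, d, K)$.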

It is worth noting that the lower bound is established under the lower bound on $T$ imposed by the first condition in (22), whereas Theorems 1 and 3 do not require it. We were not able to relax this condition in Theorem 4.

Comparing the results of Theorems 3 and 4 we conclude that the estimator $\hat\theta_*$ is rate optimal for the indicated range of $T$ and $\delta$. Due to the relationship to the M/G/∞ estimation problem, this strongly suggests that the estimator of the service time distribution of Section 4 is also rate optimal.


7 Proofs 


7.1 Proof of Proposition 1 

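For any $m \ge 1$ we write

$$
E_G\exp\Big\{\sum_{i=1}^{m}\theta_iX_i\Big\} = E_G\bigg[E_G\Big[\exp\Big\{\sum_{i=1}^{m}\theta_iX_i\Big\}\,\Big|\,\{\tau_j, j \in \mathbb{Z}\}\Big]\bigg]. \tag{23}
$$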


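By (2) and by independence of $\{\tau_j, j \in \mathbb{Z}\}$ and $\{\sigma_j, j \in \mathbb{Z}\}$, the conditional expectation in (23) takes the form

$$
\begin{aligned}
E_G\Big[\exp\Big\{\sum_{i=1}^{m}\theta_iX_i\Big\}\,\Big|\,\{\tau_j, j \in \mathbb{Z}\}\Big]
&= E_G\Big[\exp\Big\{\sum_{j \in \mathbb{Z}}\sum_{i=1}^{m}\theta_i\mathbf{1}(\tau_j \le t_i, \sigma_j > t_i - \tau_j)\Big\}\,\Big|\,\{\tau_j, j \in \mathbb{Z}\}\Big] \\
&= \prod_{j \in \mathbb{Z}} E_G\Big[\exp\Big\{\sum_{i=1}^{m}\theta_i\mathbf{1}(\tau_j \le t_i, \sigma_j > t_i - \tau_j)\Big\}\,\Big|\,\{\tau_j, j \in \mathbb{Z}\}\Big].
\end{aligned} \tag{24}
$$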


Given a; G K consider partition of the real line by the intervals Iq(x) = (—oo,ti — x], Ik[x) = 
{tk — X, tk+i — a;], fc = l,...,m—1, and Im,{x) = {tm, — x, 00 ). With this notation 


Eg 


exp <; 2 _^ < ti, aj > ti- Tj 

i=l 

k 


{TjG e z} 


{^dzl(T 

2=1 

PgWj G loiTj)} + E exp {^6ia(D < ti)}PG{o-a G 4 (d)} 

PgID' G 4(d)}. 


A;=l 


2=1 


= 1 + E [exp {XI ^*1(4 < 

k—1 2=1 


U) - 1 
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If we let 


( m k \ 

1 + 'Y2, [ 6xp {E 0 * 1 ( 0 ; < ti)| - ljPG{crj G Ik{x)}j, 
then in view of (23), (24) and Campbell’s theorem (Kingman 1993, Section 3.2) we obtain 

Ecexp I ^ = Ecexpl ^/(rj)| = expjA f - l]da; j. 

Denote Sm{S) = M— l]da;; our current goal is to compute this integral. We have 

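$$\begin{aligned}
\int_{-\infty}^{\infty}\big[e^{f(x)} - 1\big]\,dx
&= \sum_{k=1}^{m-1}\int_{-\infty}^{\infty}\Big(\exp\Big\{\sum_{i=1}^k \theta_i 1(x\le t_i)\Big\} - 1\Big)\big[\bar G(t_k - x) - \bar G(t_{k+1} - x)\big]\,dx \\
&\quad + \int_{-\infty}^{\infty}\Big(\exp\Big\{\sum_{i=1}^m \theta_i 1(x\le t_i)\Big\} - 1\Big)\bar G(t_m - x)\,dx
\;=:\; \sum_{k=1}^{m-1} J_k + J_m^*,
\end{aligned}$$

where we denoted for brevity $\bar G = 1 - G$. For $k = 1,\dots,m-1$ we obtain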

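$$\begin{aligned}
J_k &= \Big(\exp\Big\{\sum_{i=1}^k\theta_i\Big\} - 1\Big)\int_{-\infty}^{t_1}\big[\bar G(t_k - x) - \bar G(t_{k+1} - x)\big]\,dx \\
&\quad + \sum_{j=1}^{k-1}\Big(\exp\Big\{\sum_{i=j+1}^k\theta_i\Big\} - 1\Big)\int_{t_j}^{t_{j+1}}\big[\bar G(t_k - x) - \bar G(t_{k+1} - x)\big]\,dx \\
&= \frac{1}{\mu}\Big(\exp\Big\{\sum_{i=1}^k\theta_i\Big\} - 1\Big)\big[H(t_k - t_1) - H(t_{k+1} - t_1)\big] \\
&\quad + \frac{1}{\mu}\sum_{j=1}^{k-1}\Big(\exp\Big\{\sum_{i=j+1}^k\theta_i\Big\} - 1\Big)\big[H(t_k - t_{j+1}) - H(t_k - t_j) - H(t_{k+1} - t_{j+1}) + H(t_{k+1} - t_j)\big] \\
&= \frac{1}{\mu}\Big(\exp\Big\{\sum_{i=1}^k\theta_i\Big\} - 1\Big)\big[H_{k-1} - H_k\big]
 + \frac{1}{\mu}\sum_{j=1}^{k-1}\Big(\exp\Big\{\sum_{i=j+1}^k\theta_i\Big\} - 1\Big)\big[H_{k-j-1} - 2H_{k-j} + H_{k-j+1}\big]. \qquad (25)
\end{aligned}$$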


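Similarly,

$$\begin{aligned}
J_m^* &= \frac{1}{\mu}\Big(\exp\Big\{\sum_{i=1}^m\theta_i\Big\} - 1\Big)H(t_m - t_1) + \frac{1}{\mu}\sum_{j=1}^{m-1}\Big(\exp\Big\{\sum_{i=j+1}^m\theta_i\Big\} - 1\Big)\big[H(t_m - t_{j+1}) - H(t_m - t_j)\big] \\
&= \frac{1}{\mu}\Big(\exp\Big\{\sum_{i=1}^m\theta_i\Big\} - 1\Big)H_{m-1} + \frac{1}{\mu}\sum_{j=1}^{m-1}\Big(\exp\Big\{\sum_{i=j+1}^m\theta_i\Big\} - 1\Big)\big[H_{m-j-1} - H_{m-j}\big]. \qquad (26)
\end{aligned}$$

The usual convention $\sum_{i=j}^{m} = 0$ if $m < j$ is employed in (25) and (26) and from now on.

Note that by definition $S_m(\theta) = \mu\sum_{k=1}^{m-1} J_k + \mu J_m^*$, and we have the following recursive
formula

$$S_{m+1}(\theta) = S_m(\theta) + \mu\,(J_m - J_m^* + J_{m+1}^*). \qquad (27)$$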
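For any $m \ge 1$, using (25) and (26), after straightforward algebraic manipulations we obtain

$$\begin{aligned}
\mu\,(J_m - J_m^* + J_{m+1}^*)
&= \big(e^{\sum_{i=1}^m\theta_i} - 1\big)(H_{m-1} - H_m) + \sum_{j=1}^{m-1}\big(e^{\sum_{i=j+1}^m\theta_i} - 1\big)(H_{m-j-1} - 2H_{m-j} + H_{m-j+1}) \\
&\quad - \big(e^{\sum_{i=1}^m\theta_i} - 1\big)H_{m-1} - \sum_{j=1}^{m-1}\big(e^{\sum_{i=j+1}^m\theta_i} - 1\big)(H_{m-j-1} - H_{m-j}) \\
&\quad + \big(e^{\sum_{i=1}^{m+1}\theta_i} - 1\big)H_m + \sum_{j=1}^{m}\big(e^{\sum_{i=j+1}^{m+1}\theta_i} - 1\big)(H_{m-j} - H_{m-j+1}) \\
&= (e^{\theta_{m+1}} - 1) + \sum_{k=1}^m H_k\,(e^{\theta_{m-k+1}} - 1)\,e^{\sum_{i=m-k+2}^{m}\theta_i}\,(e^{\theta_{m+1}} - 1).
\end{aligned}$$

Taking into account that $S_1(\theta) = e^{\theta_1} - 1$ and iterating the formula (27) we obtain

$$\begin{aligned}
S_{n+1}(\theta) &= (e^{\theta_1} - 1) + \sum_{m=1}^{n}(e^{\theta_{m+1}} - 1) + \sum_{m=1}^{n}\sum_{k=1}^{m} H_k\,(e^{\theta_{m-k+1}} - 1)\,e^{\sum_{i=m-k+2}^{m}\theta_i}\,(e^{\theta_{m+1}} - 1) \\
&= \sum_{m=1}^{n+1}(e^{\theta_m} - 1) + \sum_{k=1}^{n} H_k\sum_{m=k}^{n}(e^{\theta_{m-k+1}} - 1)\,e^{\sum_{i=m-k+2}^{m}\theta_i}\,(e^{\theta_{m+1}} - 1).
\end{aligned}$$

This completes the proof.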
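The algebra behind the recursion (27) can be checked numerically. The following is a minimal sketch, assuming a regular grid, the normalization $H_0 = 1$, and the forms of (25) and (26) in which $\mu J_k$ and $\mu J_m^*$ are linear combinations of the values $H_r = H(r\delta)$. The function names and the exponential choice of $H$ are illustrative assumptions, not taken from the paper.

```python
import math

def e_sum(theta, a, b):
    """exp(theta_a + ... + theta_b), 1-based indices; an empty sum gives 1."""
    return math.exp(sum(theta[i - 1] for i in range(a, b + 1)))

def mu_J(theta, H, k):
    """mu * J_k on a regular grid, following the last line of (25)."""
    out = (e_sum(theta, 1, k) - 1.0) * (H[k - 1] - H[k])
    for j in range(1, k):
        out += (e_sum(theta, j + 1, k) - 1.0) * (H[k - j - 1] - 2.0 * H[k - j] + H[k - j + 1])
    return out

def mu_J_star(theta, H, m):
    """mu * J_m^* on a regular grid, following the last line of (26)."""
    out = (e_sum(theta, 1, m) - 1.0) * H[m - 1]
    for j in range(1, m):
        out += (e_sum(theta, j + 1, m) - 1.0) * (H[m - j - 1] - H[m - j])
    return out

def S_direct(theta, H):
    """S_m(theta) = mu * sum_{k<m} J_k + mu * J_m^* with m = len(theta)."""
    m = len(theta)
    return sum(mu_J(theta, H, k) for k in range(1, m)) + mu_J_star(theta, H, m)

def S_closed(theta, H):
    """Closed form obtained by iterating (27); here m = n + 1 = len(theta)."""
    m = len(theta)
    total = sum(math.exp(t) - 1.0 for t in theta)
    for k in range(1, m):
        for p in range(k, m):  # p plays the role of the summation index m in the text
            total += (H[k] * (math.exp(theta[p - k]) - 1.0)
                      * e_sum(theta, p - k + 2, p)
                      * (math.exp(theta[p]) - 1.0))
    return total

# Illustrative inputs (assumed): H(t) = exp(-0.7 t) sampled at t = 0, 1, 2, ...
theta = [0.3, -0.2, 0.5, 0.1, -0.4]
H = [math.exp(-0.7 * r) for r in range(len(theta) + 1)]
gap = abs(S_direct(theta, H) - S_closed(theta, H))
print(gap)
```

The two evaluations should agree up to floating-point rounding for any $\theta$ and any sequence $H$ with $H_0 = 1$, since the identity is purely algebraic.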


7.2 Proof of Proposition 2 


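The proof involves straightforward though tedious differentiation of (6).

Let $S(\theta)$ stand for the right hand side of (6), where $(\theta_1, \theta_2, \theta_3, \theta_4)$ is replaced by $(\theta_i, \theta_j, \theta_k, \theta_m)$
for convenience. Denote $\psi(\theta) = \mathrm{E}_G\exp\{\theta_i X_i + \theta_j X_j + \theta_k X_k + \theta_m X_m\}$. It is checked by direct
calculation that

$$\frac{\partial^4\psi(\theta)}{\partial\theta_i\,\partial\theta_j\,\partial\theta_k\,\partial\theta_m} = \exp\{\rho S(\theta)\}\big[a_1(\theta)\rho + a_2(\theta)\rho^2 + a_3(\theta)\rho^3 + a_4(\theta)\rho^4\big], \qquad (28)$$

where $a_1(\theta)$, $a_2(\theta)$, $a_3(\theta)$ and $a_4(\theta)$ are given by the following expressions:

$$\begin{aligned}
a_1(\theta) &= S_{\theta_i\theta_j\theta_k\theta_m}, \\
a_2(\theta) &= S_{\theta_i\theta_j\theta_k}S_{\theta_m} + S_{\theta_i\theta_j\theta_m}S_{\theta_k} + S_{\theta_i\theta_k\theta_m}S_{\theta_j} + S_{\theta_j\theta_k\theta_m}S_{\theta_i} + S_{\theta_i\theta_j}S_{\theta_k\theta_m} + S_{\theta_i\theta_k}S_{\theta_j\theta_m} + S_{\theta_i\theta_m}S_{\theta_j\theta_k}, \\
a_3(\theta) &= S_{\theta_i\theta_j}S_{\theta_k}S_{\theta_m} + S_{\theta_i\theta_k}S_{\theta_j}S_{\theta_m} + S_{\theta_i\theta_m}S_{\theta_j}S_{\theta_k} + S_{\theta_j\theta_k}S_{\theta_i}S_{\theta_m} + S_{\theta_j\theta_m}S_{\theta_i}S_{\theta_k} + S_{\theta_k\theta_m}S_{\theta_i}S_{\theta_j}, \\
a_4(\theta) &= S_{\theta_i}S_{\theta_j}S_{\theta_k}S_{\theta_m}.
\end{aligned}$$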

Here we put for brevity $S_{\theta_{j_1}\cdots\theta_{j_k}} = S_{\theta_{j_1}\cdots\theta_{j_k}}(\theta) := \partial^k S(\theta)/\partial\theta_{j_1}\cdots\partial\theta_{j_k}$. In fact, expression (28)
is obtained by application of Faà di Bruno’s formula for derivatives of composite functions [see, e.g.,
Riordan (1958, Chapter 2)] to (6).

In order to complete the proof, it is sufficient to note that 


5(0) = 1, 5^40) = !, Vj, 


and for any ji < j2 < J3 < Ja 




( 29 ) 


(30) 





Although (30) is proved for $1 \le i < j < k < m \le n$, a similar result holds more generally.
With the introduced definition of g/, (29), (30) imply that 

04 ( 0 ) = 1 

^3(0) ^\k—i\ d” d” ^\k—j\ d" 

“ 2 ( 0 ) = d- 

ai( 0 ) = • 

This completes the proof. ■


7.3 Proof of Theorems 1 and 2 


Throughout the proof $C_i$, $c_i$, $i = 1,2,\dots$, stand for constants depending on $\ell$ only, unless
explicitly stated otherwise. The proofs of both theorems are almost identical. We first prove Theorem 1
and then indicate the modifications needed for the proof of Theorem 2.


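It follows from (8) and (12) that

$$\hat G_h(x_0) - G(x_0) = \frac{1}{\lambda}\Big[\sum_{k\in M_{D_{x_0}}} a_k(x_0)\hat R_k - R'(x_0)\Big]
= \frac{1}{\lambda}\Big[\sum_{k\in M_{D_{x_0}}} a_k(x_0)(\hat R_k - R_k) + \sum_{k\in M_{D_{x_0}}} a_k(x_0)R_k - R'(x_0)\Big].$$

Therefore

$$\Big[\mathrm{E}_G|\hat G_h(x_0) - G(x_0)|^2\Big]^{1/2} \le \frac{1}{\lambda}\Big\{\mathrm{E}_G\Big|\sum_{k\in M_{D_{x_0}}} a_k(x_0)(\hat R_k - R_k)\Big|^2\Big\}^{1/2} + \frac{1}{\lambda}\Big|\sum_{k\in M_{D_{x_0}}} a_k(x_0)R_k - R'(x_0)\Big|. \qquad (31)$$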


In the subsequent proof we bound the expression on the right hand side of the above display
formula. The result of the theorem will follow from a series of lemmas given below.

We begin with a well known result on the properties of the local polynomial estimators; see, 
e.g., Nemirovski (2000, Lemma 1.3.1) and Tsybakov (2009, Section 1.6). 

Lemma 1 Let {ak{xo), k G } be the solution to (t^x), xnd let (10) hold; then 


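$$\Big(\sum_{k\in M_{D_{x_0}}}|a_k(x_0)|^2\Big)^{1/2} \le \frac{C_1}{h\sqrt{N_{D_{x_0}}}}, \qquad \sum_{k\in M_{D_{x_0}}}|a_k(x_0)| \le \frac{C_2}{h}, \qquad (32)$$

where $C_1 = C_1(\ell)$ and $C_2 = C_2(\ell)$ are constants depending on $\ell$ only.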

The next result establishes an upper bound on accuracy of the empirical covariance estimator. 


Lemma 2 For any $k = 0,\dots,n-1$ one has

$$\mathrm{E}_G|\hat R_k - R_k|^2 \le \frac{C_3(\rho^2 + \rho)}{n-k}\sum_{i=1}^{n} H_i,$$

where $C_3$ is an absolute constant.
Proof: Let $\tilde R_k := \frac{1}{n-k}\sum_{i=1}^{n-k}(X_i - \rho)(X_{i+k} - \rho)$; then $\mathrm{E}_G\tilde R_k = R_k$, and by definition of $\hat\rho_k$

$$\hat R_k = \frac{1}{n-k}\sum_{i=1}^{n-k}(X_i - \hat\rho_k)(X_{i+k} - \hat\rho_k) = \tilde R_k - (\hat\rho_k - \rho)^2.$$

Therefore 

EclRk -Rk\'^ = EclRk - - 2EG[RkiPk - p)^] + EclPk - p\^ =: Ji - 2 J 2 + Ja- (33) 

Now we proceed with computation of the terms on the right hand side of (33). 

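1°. Computation of $J_1$.

Let $r_k := \mathrm{E}_G[X_iX_{i+k}] = R_k + \rho^2 = \rho H_k + \rho^2$ and $\hat r_k := \frac{1}{n-k}\sum_{i=1}^{n-k}X_iX_{i+k}$; then

$$\tilde R_k - R_k = \hat r_k - r_k + 2\rho^2 - \frac{\rho}{n-k}\sum_{i=1}^{n-k}(X_i + X_{i+k}).$$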


Thus 

Jl = R^G\Rk — Rk\'^ 


n—k 


= EG|fife-rfe|2-2EG (fife-rfe);^^(X, + X,+fe) + Eg 2p2 _ ^ ^(X, + X.+O 


n — k 






(34) 


Equality (7) of Proposition 2 implies that for any fc = 0,...,n and i,j = 1,... ,n — k one has 

Eg [XiXi^kXjXj+k] = [Hk + R\j-i\ + H\j-i+k\ + H\j -i-k\ + H\j-i\ + Hk] 

+ P^ [2fi^fev|j — i\V\j—i — k\ ^R-kV\j—i\W\j — i-\-k\ R-k R\j—i\ R\j—'i-\-k\R\j — i—k\\ 

T pRk\/\j — i\V\j — i-\-k\'\/\j—i—k\' 


Since + ‘^p'^Hk + P^H\ 


k 5 


i — k 


= EGlfife - rfep = Y, Eg [Xa^+kXJX,+k] - r 




n—k 




fc7 X] +-^b-i+fc|] + pHkv\j 


-i|V|j-z-fe|V|j-i+fe| 


(35) 


i,i=l 


+ p^ \2Hky\j -i|V|j-i-fe| +2Hky\j-i\y\j-i+k\ + H'^j-i\ + fi^|i-i-fe| fi^|i-i+fe| ] | • 

Furthermore, 

n—k n—k 

^ = V - ^Eg X(^* + + Eg X + ^.'+0 


2=1 

n—k 




- -4p'‘ + X + ’"b-i+fcl + 




'i—k 


~ {n-ky X ■ 

* j=i 


(36) 


( 2 ) 

Now we proceed with Jj : 

n—k 

4 ^^ = ;^XEG(fifc-^fc)(^*+^*+fe) = ;^X [EG(fife^^+fifc^^+fc)-2p(p2 + pi^fe)]. 


1 — k 


2=1 


2=1 
We have


n — k 

i=i 

n—k 

= ;r^ X! \-P^ + + P^^\o-i\ + p^H\j_i_k\ + p7?fcvii-i|v|i-i-fc|] 

i=i 

n—k 

EcihX^+k] = ^Y.^G[XjX,+kX,+k] 

i=i 

n—k 

= TT^ X! ^ P^^\o-i\ + P^H\j-i+k\ + pi^fev|J-i|V|^-j+fc|] , 

i=i 

which yields 

n — k 

Jl = (n-ky-^ 'y \?P ^\3-i\ P P^\i-3+k\ + P ^|i- j-/c| + P^^^|i-j|VfeV|i-j-/c| + P^^^|i-j|VfcV|i-j+fe|] • (37) 
i,j=l 

Combining (37), (36), (35) and (34) we obtain 

n—k n—k 

•^1 “ (n-k)^ ^ + (n-k)^ ^ v|i_j_fc|v|z-j+fc| • 

i,j=l j.i=i 

Taking into account that $H$ is a monotone decreasing function, and $H(0) = 1$ we obtain

n 

Ji ^ yy^ip + p) ^ 

where $c_1$ is an absolute constant.

2°. Computation of $J_2$. It follows from the definition of $J_2$ that

J 2 = p^Hk — 2pEG[RkPk] + EG[RkPk]- 

We have 

n—k 

Eg [RkPk] = {n\y^ ^ {Xi - p){Xi+k - p)Xj 

ij = l 

n—k 

* J = 1 

n — k 

= p^Hk + ^ 77fev|z-j|v|z-j+fe| ■ 

i,i=l 
Furthermore,


'i—k 


^G[Rkpl] = 


*0,^=1 


X,X,+kXjXi - pX,+kX,Xi - pX.XjXi + p^XjXi 


n — k 

E [ 


(F=feF I Rky\i-j\y\i-]+k\ + i?|i_i|v|i-i+fe|vfc + HkH\j_i\ + 

n—k 

+ P^Hk + X] '^fcv|i-i|vK-z|v|i+fc-j|v|i+fe-z|v|j-z| 




i—k 


3 ^ 

= p Hk + ^ 2 _fffcv|i_j|v|i-j+fc| 


* j=i 


'i—k 


”*"(ri-fc)3 {RkHy-i\ + + pi?fcv|i-i|v|z-i|v|i+fc-j|v|i+fc-;|v|j-;| 

i,3,1=1 

Combining these equalities we obtain 

T. —fe 


J 2 = 




E [p^ -|- -ff|2_/-(-/i;| “t“ -^li —^1 j+A:| ) “t” P-^fcV|2—j|V|2—/|V|2+fc—j|V|z+/i:— /|v|j — /| 




< ai(p^+p)E^- 


where $c_2$ is an absolute constant.

3°. Computation of $J_3$. By definition, $J_3 = \mathrm{E}_G|\hat\rho_k - \rho|^4$. Using Proposition 2,
after routine calculations we obtain for all $i, j, l, m = 1,\dots,n-k$

$$\mathrm{E}_G\big[(X_i - \rho)(X_j - \rho)(X_l - \rho)(X_m - \rho)\big]
= \rho^2\big[H_{|i-j|}H_{|l-m|} + H_{|i-l|}H_{|j-m|} + H_{|l-j|}H_{|i-m|}\big] + \rho H_{|i-j|\vee|j-l|\vee|l-m|\vee|j-m|\vee|i-m|\vee|i-l|},$$

so that

'i—k 


•h — {n-ky E R\i-j\R\l-m\ + H\i-l\H\j-m\ + H\l-j\H\i-m\] 

n—k n 

^ ^ ^jz—_7|V|j —Z|V|/—m|V|^ —m|v|z—m|V|2—/| E n—k p) : 


P 

{n—k)‘^ 


z,j,/,m=l 

where $c_3$ is an absolute constant.

Combining the inequalities for $J_1$, $J_2$ and $J_3$ with (33) we complete the proof.


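Lemma 3 For every $x_0 \in [0, T - \delta]$ one has

$$\mathrm{E}_G\Big|\sum_{k\in M_{D_{x_0}}} a_k(x_0)(\hat R_k - R_k)\Big|^2 \le \frac{C_4\,\delta}{h^2\,\psi_{x_0}(T)}\,(\rho^2 + \rho)\sum_{i=1}^{n} H_i,$$

where $C_4 = C_4(\ell)$ is a constant depending on $\ell$ only, and

$$\psi_{x_0}(T) = \psi_{x_0}(T, h, \delta) := \begin{cases} T - x_0 - h, & h < x_0 < T - \delta - h, \\ T - 2h, & 0 \le x_0 \le h, \\ \delta, & T - \delta - h \le x_0 \le T - \delta. \end{cases}$$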
Proof: By Lemmas 1 and 2 and by the Cauchy-Schwarz inequality


E 


^ ak{xo){Rk - Rk) < Y. E ^G{Rk-RkY 




k&Mo, 




^xn »•_1 i.^7i,r 


(38) 


i —1 

Let $\underline{k} = \min\{k \in (1,\dots,n-1) : k \in M_{D_{x_0}}\}$ and $\bar k = \max\{k \in (1,\dots,n-1) : k \in M_{D_{x_0}}\}$; then

k 




k — k\ k — k 


keMo^ 


k=k 


— k 


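First, assume that $D_{x_0} = [x_0 - h, x_0 + h]$. In this case $\underline{k} = \lfloor (x_0 - h)/\delta\rfloor + 1$, $\bar k = \lfloor (x_0 + h)/\delta\rfloor$,
where $\lfloor\cdot\rfloor$ is the integer part, and then $\sum_{k\in M_{D_{x_0}}}(n-k)^{-1} \le 2h/(T - x_0 - h)$. If $D_{x_0} = [0, 2h]$
then $\underline{k} = 1$ and $\bar k = \lfloor 2h/\delta\rfloor$, which leads to $\sum_{k\in M_{D_{x_0}}}(n-k)^{-1} \le 2h/(T - 2h)$. Finally, if
$D_{x_0} = [T - 2h - \delta, T - \delta]$ then $\bar k = n - 1$, $\underline{k} = (n-1) - \lfloor 2h/\delta\rfloor$, and $\sum_{k\in M_{D_{x_0}}}(n-k)^{-1} \le 2h/\delta$.

Combining these bounds with (38) and taking into account that $(2h/\delta) - 1 \le N_{D_{x_0}} \le
(2h/\delta) + 1$, we complete the proof. ■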


Lemma 4 Let G € /), J = [xq — d, x^ + d] 2 D^g, and {afe(xo), k € } he the weights 

defined by (S^xo) with i> [/3J +1. Assume that (10) holds; then 


$$\Big|\sum_{k\in M_{D_{x_0}}} a_k(x_0)R_k - R'(x_0)\Big| \le C_2\,\lambda L h^{\beta},$$

where $C_2 = C_2(\ell)$ is the constant appearing in (32).


Proof: Recall R{t) = + ph[t) = p^ + \ — G(x)]dx; this implies 

$$R'(t) = -\lambda(1 - G(t)), \qquad R^{(j)}(t) = \lambda G^{(j-1)}(t), \quad \forall j = 2,\dots,\lfloor\beta\rfloor + 1.$$

Thus if G € J^piL,I) then R G J^p+iiXL,I). Since Dxg Q I, function R can be expanded in 
the Taylor series around Xq. In particular, for any k G -^Dxq 

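$$R(k\delta) = R(x_0) + \sum_{j=1}^{\lfloor\beta\rfloor}\frac{1}{j!}R^{(j)}(x_0)(k\delta - x_0)^j + \frac{1}{(\lfloor\beta\rfloor+1)!}R^{(\lfloor\beta\rfloor+1)}(\xi_k)(k\delta - x_0)^{\lfloor\beta\rfloor+1}, \qquad (39)$$

where $\xi_k = \tau k\delta + (1-\tau)x_0$ for some $\tau\in[0,1]$. Denote

$$R_{x_0}(y) := R(x_0) + \sum_{j=1}^{\lfloor\beta\rfloor+1}\frac{1}{j!}R^{(j)}(x_0)(y - x_0)^j, \qquad y\in D_{x_0}. \qquad (40)$$

Because $R_{x_0}(\cdot)$ is a polynomial of degree $\lfloor\beta\rfloor + 1$ and $\ell \ge \lfloor\beta\rfloor + 1$, we have by (11) that

$$\sum_{k\in M_{D_{x_0}}} a_k(x_0)R_{x_0}(k\delta) = R_{x_0}'(x_0) = R'(x_0).$$

Therefore

$$\begin{aligned}
\sum_{k\in M_{D_{x_0}}} a_k(x_0)R(k\delta) - R'(x_0) &= \sum_{k\in M_{D_{x_0}}} a_k(x_0)\big[R(k\delta) - R_{x_0}(k\delta)\big] \\
&= \sum_{k\in M_{D_{x_0}}}\frac{1}{(\lfloor\beta\rfloor+1)!}\,a_k(x_0)\big[R^{(\lfloor\beta\rfloor+1)}(\xi_k) - R^{(\lfloor\beta\rfloor+1)}(x_0)\big](k\delta - x_0)^{\lfloor\beta\rfloor+1},
\end{aligned}$$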
k€MD,cg 
where we have used (39) and (40). This yields

$$\Big|\sum_{k\in M_{D_{x_0}}} a_k(x_0)R(k\delta) - R'(x_0)\Big| \le \frac{\lambda L h^{\beta+1}}{(\lfloor\beta\rfloor+1)!}\sum_{k\in M_{D_{x_0}}}|a_k(x_0)| \le C_2\,\lambda L h^{\beta},$$

where the last inequality follows from (32).
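The cancellation used in the proof (weights that reproduce the derivative of low-degree polynomials leave only a Taylor remainder as bias) can be illustrated with a minimal sketch. The three-point central-difference weights below are an assumed stand-in for the local polynomial weights $a_k(x_0)$, and $R(t) = e^{-t}$ is an assumed smooth test function; neither is taken from the paper.

```python
import math

def weighted_sum(weights, values):
    """Linear functional sum_k a_k * value_k."""
    return sum(a * v for a, v in zip(weights, values))

x0, h = 1.0, 0.1
grid = [x0 - h, x0, x0 + h]
# Assumed illustrative weights: they reproduce p'(x0) exactly for polynomials
# of degree <= 2, and their absolute sum equals 1/h, mirroring the form of (32).
weights = [-1.0 / (2.0 * h), 0.0, 1.0 / (2.0 * h)]

def poly(x):
    """Quadratic test polynomial with derivative -2 + x."""
    return 3.0 - 2.0 * x + 0.5 * x * x

poly_estimate = weighted_sum(weights, [poly(x) for x in grid])

def R(t):
    """Smooth test function standing in for the covariance function."""
    return math.exp(-t)

# Bias of the weighted sum as an estimate of R'(x0) = -exp(-x0).
bias = abs(weighted_sum(weights, [R(x) for x in grid]) - (-math.exp(-x0)))
abs_weight_sum = sum(abs(a) for a in weights)
print(poly_estimate, bias, abs_weight_sum)
```

For the quadratic the weighted sum recovers the derivative up to rounding, while for the smooth function the bias is of order $h^2$, in line with a bound of the form "remainder times $\sum_k |a_k(x_0)|$".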


Now we complete the proof of Theorem 1.

First we note that because G G ^ 2 {K)^ 

n n 

Y.H, = y^i?(*<5)<i/ H{t)At 
2—1 2—1 

pT pOQ poo 

= j / [1 — G{x)]dxdt < ^ / x[l — G{x)]dx < ^K. (41) 

Jo Jt Jo 

Moreover, G € ^^{K) implies also that A < ^/H. 

It can be easily verified that under (14) and (13) for all $T$ large enough we have $T - x_0 \ge cT$,
and $D_{x_0}$ contains at least $\ell + 1$ grid points. Therefore, by Lemmas 3 and 4 and (41), the chosen
window width $h = h_*$ balances the upper bounds on the two terms on the right hand side of (31).
The result of Theorem 1 follows immediately by substitution of $h_*$ in the bounds of Lemmas 3
and 4.


In order to prove Theorem 2 we note that the bias-variance decomposition in the problem
of estimating $\lambda$ takes the form

$$\Big[\mathrm{E}_{G,\lambda}|\hat\lambda - \lambda|^2\Big]^{1/2} \le \Big\{\mathrm{E}_{G,\lambda}\Big|\sum_{k} a_k(x_0)(\hat R_k - R_k)\Big|^2\Big\}^{1/2} + \Big|\sum_{k} a_k(x_0)R_k - R'(x_0)\Big|;$$

cf. (31). The same upper bounds on the bias (Lemma 4) and the variance (Lemma 3) hold. The
upper bound (20) follows by the special choice of the window width in (18).


7.4 Proof of Theorem 4 


The following notation and definitions are used throughout the proof. 

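If $A = \{a_{jk}\}_{j,k=1,\dots,n}$ is an $n\times n$ matrix then $\|A\|_2 = \sup_{\|x\|_2\le 1}\|Ax\|_2$ is the spectral norm
of $A$, and $\|A\|_F = \big(\sum_{j,k}|a_{jk}|^2\big)^{1/2}$ is the Frobenius norm of $A$.

Let $v$ be an integrable function on $[-\pi,\pi]$; its Fourier series is given by $v(\omega) = \sum_{j=-\infty}^{\infty} v_j e^{ij\omega}$,
$\omega\in[-\pi,\pi]$, where the corresponding Fourier coefficients are

$$v_j = \frac{1}{2\pi}\int_{-\pi}^{\pi} v(\omega)e^{-ij\omega}\,d\omega, \qquad j\in\mathbb{Z}.$$

For an integrable function $v$ on $[-\pi,\pi]$, let $T_n(v)$ stand for the $n\times n$ Toeplitz matrix with the
elements

$$[T_n(v)]_{j,k} = v_{j-k} = \frac{1}{2\pi}\int_{-\pi}^{\pi} v(\omega)e^{-i(j-k)\omega}\,d\omega, \qquad j,k = 1,\dots,n.$$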













7.4.1 Auxiliary results 


The following result is stated and proved in Davies (1973). 

Lemma 5 Let $A$ be an $n\times n$ matrix such that $\|A\|_2 < 1$; then

$$\big|\log\det(I + A) - \mathrm{tr}(A) + \tfrac{1}{2}\mathrm{tr}(A^2)\big| \le \tfrac{1}{3}\|A\|_2\,\|A\|_F^2\,(1 - \|A\|_2)^{-1}.$$
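A quick numerical sanity check of the inequality of Lemma 5 is sketched below for a symmetric $2\times 2$ matrix, where eigenvalues and the determinant are available in closed form. The constant $1/3$ on the right hand side is an assumption here: it is the constant obtained from the series expansion of $\log\det(I + A)$, and the printed constant in the source is not legible. The function name and test matrix are illustrative.

```python
import math

def davies_sides(a, b, c):
    """Return (lhs, rhs, spectral_norm) for the symmetric matrix A = [[a, b], [b, c]]."""
    tr = a + c
    det = a * c - b * b
    disc = math.sqrt((a - c) ** 2 + 4.0 * b * b)
    eig1, eig2 = (tr + disc) / 2.0, (tr - disc) / 2.0
    spec = max(abs(eig1), abs(eig2))          # spectral norm of a symmetric matrix
    frob2 = a * a + 2.0 * b * b + c * c       # squared Frobenius norm = tr(A^2) here
    logdet = math.log(1.0 + tr + det)         # det(I + A) = 1 + tr(A) + det(A) for 2 x 2
    lhs = abs(logdet - tr + 0.5 * frob2)
    rhs = spec * frob2 / (3.0 * (1.0 - spec))  # assumed constant 1/3
    return lhs, rhs, spec

lhs, rhs, spec = davies_sides(0.3, 0.1, -0.2)
print(lhs, rhs, spec)
```

The check requires $\|A\|_2 < 1$; for the matrix above the left hand side is roughly five times smaller than the right hand side.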


In the proof of Theorem 4 we use properties of Toeplitz matrices which are presented in the
next lemma. Some of these statements can be viewed as “finite sample” versions of asymptotic
results from Davies (1973) and Dzhaparidze (1986).

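Lemma 6 Let $v, u \in L_1([-\pi,\pi]) \cap L_2([-\pi,\pi])$ be functions with the Fourier coefficients $\{v_j\}$
and $\{u_j\}$ respectively.

(i) Let $\sup_{\omega\in[-\pi,\pi]} v(\omega) \le M < \infty$; then $\|T_n(v)\|_2 \le M$.

(ii) Let $\inf_{\omega\in[-\pi,\pi]} v(\omega) \ge m > 0$; then $\|T_n^{-1}(v)\|_2 \le m^{-1}$.

(iii) $\|T_n(v)\|_F^2 \le \frac{n}{2\pi}\int_{-\pi}^{\pi}|v(\omega)|^2\,d\omega$.

(iv) Suppose that $|v(\omega)| \le M_1 < \infty$, and $\sum_{j=-\infty}^{\infty}|j||u_j|^2 \le M_2 < \infty$ for some constants $M_1$
and $M_2$; then

$$\|T_n(vu) - T_n(v)T_n(u)\|_F^2 \le 4M_1^2M_2.$$

(v) Let conditions of (iv) hold, and let $\inf_{\omega\in[-\pi,\pi]} v(\omega) \ge m > 0$; then

$$\|T_n^{-1}(v)T_n(vu) - T_n(u)\|_F^2 \le 4m^{-2}M_1^2M_2.$$

(vi) Let conditions of (iv) and (v) hold; then

$$\mathrm{tr}\big\{[T_n^{-1}(v)T_n(vu)]^2\big\} \le \frac{n}{\pi}\int_{-\pi}^{\pi}u^2(\omega)\,d\omega + 8m^{-2}M_1^2M_2.$$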


Proof: The statements (i), (ii) and (iii) are standard. See Grenander & Szegő (1984, p. 64) for
(i) and (ii), while (iii) is an immediate consequence of Parseval’s equality:

$$\|T_n(v)\|_F^2 = \sum_{j=1-n}^{n-1}(n - |j|)|v_j|^2 \le n\sum_{j=-\infty}^{\infty}|v_j|^2 = \frac{n}{2\pi}\int_{-\pi}^{\pi}|v(\omega)|^2\,d\omega.$$
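The identity and the bound in (iii) are easy to confirm on a small example. The sketch below is a minimal illustration, assuming the real even symbol $v(\omega) = 1 + \cos\omega + \tfrac12\cos 2\omega$ (so $v_0 = 1$, $v_{\pm1} = 1/2$, $v_{\pm2} = 1/4$) and $n = 6$; the helper names are hypothetical.

```python
import math

def toeplitz(coeffs, n):
    """n x n Toeplitz matrix with entries [j][k] = v_{j-k}; missing coefficients are 0."""
    return [[coeffs.get(j - k, 0.0) for k in range(n)] for j in range(n)]

def frobenius_sq(matrix):
    """Squared Frobenius norm."""
    return sum(abs(x) ** 2 for row in matrix for x in row)

# Assumed example symbol: v(w) = 1 + cos(w) + 0.5 * cos(2w).
coeffs = {0: 1.0, 1: 0.5, -1: 0.5, 2: 0.25, -2: 0.25}
n = 6

frob = frobenius_sq(toeplitz(coeffs, n))
weighted = sum((n - abs(j)) * abs(v) ** 2 for j, v in coeffs.items() if abs(j) < n)
parseval_bound = n * sum(abs(v) ** 2 for v in coeffs.values())

# (n / 2 pi) * integral of |v|^2 via a uniform Riemann sum, which is exact up to
# rounding for a trigonometric polynomial of this degree.
points = 64
mean_sq = sum(
    sum(v * math.cos(j * 2.0 * math.pi * p / points) for j, v in coeffs.items()) ** 2
    for p in range(points)
) / points
integral_bound = n * mean_sq
print(frob, weighted, parseval_bound, integral_bound)
```

Here the Frobenius norm squared equals the weighted coefficient sum exactly, and both are dominated by $n\sum_j |v_j|^2$, which coincides with the integral form of the bound.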


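(iv). Denote $w(\omega) := v(\omega)u(\omega)$. By Parseval’s equality $w_j = \sum_{l=-\infty}^{\infty} v_l u_{j-l}$. Therefore
the $(j,k)$th element of matrix $T_n(w) - T_n(v)T_n(u)$ equals

$$\begin{aligned}
\sum_{l=-\infty}^{\infty} v_l u_{j-k-l} - \sum_{l=1}^{n} v_{j-l}u_{l-k}
&= \sum_{l=-\infty}^{\infty} v_l u_{j-k-l} - \sum_{l=j-n}^{j-1} v_l u_{j-k-l} \\
&= \sum_{l=-\infty}^{j-n-1} v_l u_{j-k-l} + \sum_{l=j}^{\infty} v_l u_{j-k-l}.
\end{aligned}$$

Hence

$$\|T_n(vu) - T_n(v)T_n(u)\|_F^2 \le 2\sum_{j=1}^{n}\sum_{k=1}^{n}\Big|\sum_{l=-\infty}^{j-n-1} v_l u_{j-k-l}\Big|^2 + 2\sum_{j=1}^{n}\sum_{k=1}^{n}\Big|\sum_{l=j}^{\infty} v_l u_{j-k-l}\Big|^2.$$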
j=l k=l l=j 
Consider the first term; the second term is bounded similarly. Let $\Lambda$ denote the backward shift
operator on the space of two-sided sequences: $(\Lambda u)_j = u_{j-1}$, $j\in\mathbb{Z}$. For fixed $k$ and $n$ consider
the function on $[-\pi,\pi]$ whose Fourier coefficients are $\{(\Lambda^k u)_j 1(j \ge n+1), j\in\mathbb{Z}\}$. Then
with the introduced notation,


n n j—n—1 


j — 1 k—1 1 — — QO 


2 


n oo CO 2 

- XI XI I XI - l>n + l} 

k—lj— — <x> 1— — 00 

n pTi TL pTi 

= 2fX/ 

fc=i k=i 


= ^iX X l(AMA{/>n + l}|'<M2^/u?, 

k—1 l— — (x> 1—1 


where the second and third lines follow from Parseval’s equality and the premise of the statement.


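(v). We have

$$\begin{aligned}
\|T_n^{-1}(v)T_n(vu) - T_n(u)\|_F &= \|T_n^{-1}(v)[T_n(vu) - T_n(v)T_n(u)]\|_F \\
&\le \|T_n^{-1}(v)\|_2\,\|T_n(vu) - T_n(v)T_n(u)\|_F.
\end{aligned}$$

Then the statement follows from (ii) and (iv).

(vi). We have

$$\mathrm{tr}\big\{[T_n^{-1}(v)T_n(vu)]^2 - T_n^2(u)\big\} \le \|T_n^{-1}(v)T_n(vu) - T_n(u)\|_F\big[\|T_n(u)\|_F + \|T_n^{-1}(v)T_n(vu)\|_F\big].$$

Clearly,

$$\|T_n^{-1}(v)T_n(vu)\|_F \le \|T_n^{-1}(v)T_n(vu) - T_n(u)\|_F + \|T_n(u)\|_F;$$

hence, by (v) and (iii)

$$\begin{aligned}
\mathrm{tr}\big\{[T_n^{-1}(v)T_n(vu)]^2 - T_n^2(u)\big\}
&\le \|T_n^{-1}(v)T_n(vu) - T_n(u)\|_F^2 + 2\|T_n^{-1}(v)T_n(vu) - T_n(u)\|_F\,\|T_n(u)\|_F \\
&\le 4m^{-2}M_1^2M_2 + 2m^{-1}M_1\sqrt{M_2}\,\Big[\frac{2n}{\pi}\int_{-\pi}^{\pi}u^2(\omega)\,d\omega\Big]^{1/2}.
\end{aligned}$$

Therefore using (iii) we obtain

$$\begin{aligned}
\mathrm{tr}\big\{[T_n^{-1}(v)T_n(vu)]^2\big\} &\le \mathrm{tr}\{T_n^2(u)\} + 4m^{-2}M_1^2M_2 + 2m^{-1}M_1\sqrt{M_2}\,\Big[\frac{2n}{\pi}\int_{-\pi}^{\pi}u^2(\omega)\,d\omega\Big]^{1/2} \\
&\le \frac{n}{\pi}\int_{-\pi}^{\pi}u^2(\omega)\,d\omega + 8m^{-2}M_1^2M_2,
\end{aligned}$$

as claimed.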


7.4.2 Proof of Theorem 4 

The proof is based on standard reduction to a two-point hypotheses testing problem [cf. Tsybakov
(2009, Chapter 2)].

Throughout the proof the following notation and conventions are used. We use symbols
$C_0, C_1, \dots, c_0, c_1, \dots$ to denote positive constants depending on $\beta$, $x_0$, $d$ and $K$ only, unless
explicitly specified.
0°. Reduction to a hypotheses testing problem. Let $x_0$ be fixed, $I = [x_0 - d, x_0 + d] \subset (0,\infty)$,
and let 70 and 71 be a pair of covariance functions from I, K) such that 


$$a := |\theta(\gamma_0) - \theta(\gamma_1)| = |\gamma_0'(x_0) - \gamma_1'(x_0)| > 0. \qquad (42)$$


For an arbitrary estimator $\hat\theta$ we have


T^xo[0',‘r^l3{L,I,K)] > sup E^| 0 - 6 »( 7 )p > sup - 6 »( 7 )| > f} 

7 e{ 7 o, 7 l} 76(70,71} 

P7o{l^-%0)| > f}+P7l{l^-%l)l > §} 


> 


1 „2 


(43) 


Suppose that on the basis of observations $X^n = (X_1,\dots,X_n)$ we want to test the hypothesis
$H_0 : \theta(\gamma) = \theta(\gamma_0)$ against the alternative $H_1 : \theta(\gamma) = \theta(\gamma_1)$. Assume that for this purpose we
apply the following minimum distance testing procedure $\psi(X^n)$: given an estimator $\hat\theta$ we accept
the $i$th hypothesis, $i = 0,1$, with $\theta(\gamma_i)$ closest to $\hat\theta$, i.e., $\psi(X^n) = \arg\min_{j=0,1}|\hat\theta - \theta(\gamma_j)|$. Then,
by the triangle inequality, the expression on the right hand side of (43) is not less than the sum
of error probabilities of the minimum distance test:

'R,,[e-^p{L,I,K)] > y[P^„{^iX^) = l} + P^,{ij{X^)=0}] > ia27r(P^„,P^J,(44) 

where $\pi(P_{\gamma_0}, P_{\gamma_1}) = \inf_{\psi}[\mathrm{E}_{\gamma_0}(1 - \psi) + \mathrm{E}_{\gamma_1}\psi]$ is the testing affinity between $P_{\gamma_0}$ and $P_{\gamma_1}$; the
infimum is taken over all tests measurable with respect to the observation $X^n$.

Thus the problem is reduced to constructing the worst-case alternatives $\gamma_0$ and $\gamma_1$, and
bounding the testing affinity $\pi(P_{\gamma_0}, P_{\gamma_1})$. The last step will be accomplished by bounding from
above the Kullback-Leibler divergence $\mathcal{K}(P_{\gamma_0}, P_{\gamma_1}) = \mathrm{E}_{\gamma_1}\log[(dP_{\gamma_1}/dP_{\gamma_0})(X^n)]$ between $P_{\gamma_0}$
and $P_{\gamma_1}$ because

$$\pi(P_{\gamma_0}, P_{\gamma_1}) \ge \tfrac{1}{2}\exp\{-\mathcal{K}(P_{\gamma_0}, P_{\gamma_1})\}; \qquad (45)$$

see, e.g., Tsybakov (2009, Theorem 2.4.2).

Construction of the worst-case alternatives. Let $\phi$ be an infinitely differentiable even
function with the following properties:

$$\phi(\omega) = \begin{cases} 1, & |\omega| \le 1, \\ 0, & |\omega| \ge 3/2, \end{cases} \qquad 0 < \phi(\omega) < 1, \quad \omega\in[-\tfrac{3}{2}, -1]\cup[1, \tfrac{3}{2}], \qquad (46)$$

and $\phi$ is monotone on $[-\tfrac{3}{2}, -1]$ and $[1, \tfrac{3}{2}]$. Because $\phi$ is an infinitely differentiable function
with bounded support, the inverse Fourier transform of $\phi$ is a rapidly decreasing infinitely
differentiable function: for all $m, k \in \mathbb{N}$

/ CO 

(-iw)'=^(™)(w)e-*‘"‘dw ^ |(^('=)(t)| < C(m,fc)|tr™, Vt. (47) 


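Let $\ell$ be an even integer number, and let

$$\zeta_0(t) = (\underbrace{1_{[-1,1]} * \cdots * 1_{[-1,1]}}_{\ell})(t), \qquad t\in\mathbb{R},$$

where $*$ stands for the convolution on $\mathbb{R}$. Put $\zeta(t) := \zeta_0(\ell t/(x_0 - d))$. Clearly, $\mathrm{supp}(\zeta) =
[-x_0 + d, x_0 - d]$, and the Fourier transform of $\zeta$ is

$$\hat\zeta(\omega) = \int_{-\infty}^{\infty}\zeta(t)e^{i\omega t}\,dt = [(x_0 - d)/\ell]\,\Big[\frac{2\sin(\omega(x_0 - d)/\ell)}{\omega(x_0 - d)/\ell}\Big]^{\ell}, \qquad \omega\in\mathbb{R}. \qquad (48)$$

Because $\ell$ is even, function $\hat\zeta$ is non-negative on $\mathbb{R}$.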
Let


where $N_0 \ge 1$ is an integer number to be specified, and define

$$f_0(\omega) = c_0\,\delta\,\phi(\omega\delta/\pi) + c_1\big[\hat\zeta(\omega - N) + \hat\zeta(\omega + N)\big], \qquad (49)$$

where cq and Ci are positive constants. We claim that, under appropriate choice of cq and ci 
function fo is a spectral density with the corresponding covariance function 70 that belongs to 
'^/ 3 (c 2 L, d, TV) with preassigned C 2 G ( 0 , 1 ). 

By definition fo is non-negative and even on R; hence by Bochner’s theorem, it is a spectral 
density. The corresponding covariance function is 

/ OO 

/o(w)e““*da; = fcpiirt/S) + ^({t) cos{Nt). 

-OO 

Because supp(C) = [—xq + d, xq — d], 70 (t) = ^4’i'^t/^) for t ^ I = [xq — d, xq + d]. Then in view 
of (47), 

|7(^+i)(t)| = f|<),(/5+i)(^t/d)|(7r/d)(^+i) < fC{(3 + l,l3+l)\xo-d\-^-\ Wt G I, 

where is a constant appearing in (47). Choosing cq and Ci small enough we ensure that 

7 o G ^ 0 {c 2 L,I,K) with preassigned 0 < C 2 < 1. 

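Now we proceed with definition of $f_1$ and $\gamma_1$. With $f_0$ and $\phi$ given by (49) and (46) respectively
define

$$\psi(\omega) := f_0(\omega)\,\omega\sin(\omega x_0)\Big[\phi\big(\tfrac{6x_0}{\pi}(\omega - N)\big) + \phi\big(\tfrac{6x_0}{\pi}(\omega + N)\big)\Big].$$

By definition,

$$\mathrm{supp}(\psi) = \big[-N - \tfrac{\pi}{4x_0}, -N + \tfrac{\pi}{4x_0}\big] \cup \big[N - \tfrac{\pi}{4x_0}, N + \tfrac{\pi}{4x_0}\big].$$

For a function $g$ on $\mathbb{R}$ we put

$$B_N(g) := \int_{-\infty}^{\infty} g^2(\omega)\sin^2(\omega x_0)\,\omega^2\Big[\phi\big(\tfrac{6x_0}{\pi}(\omega - N)\big) + \phi\big(\tfrac{6x_0}{\pi}(\omega + N)\big)\Big]\,d\omega,$$

and let

$$f_1(\omega) = f_0(\omega)\big\{1 + c_3LN^{-\beta}[B_N(f_0)]^{-1}\psi(\omega)\big\}. \qquad (50)$$

Let us verify that $f_1$ is a spectral density. We have $f_1(\omega) = f_1(-\omega)$ for all $\omega\in\mathbb{R}$. To ensure
that $f_1$ is non-negative it suffices to require that

$$1 \ge c_3LN^{-\beta}[B_N(f_0)]^{-1}f_0(\omega)|\omega|, \qquad \forall\omega\in\mathrm{supp}(\psi). \qquad (51)$$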

By definition of (j), Bn and fo, 

BNifo) > 2 / /o(a;)sin^(a;xo)a;^da; 

dN--^ 

[C(a;)]2(a; + TV)2dw > c^N^, (52) 

where we took into account that sin(a;xo) > •\/3/2 whenever w G [TV — g^, TV -|- gf^]- Moreover, 
/o(w) < Cq for all w G R; therefore (51) will hold if 1 > ctLN~^~^. This condition will be 
ensured for large T by our final choice of TV [cf. (66)] . Thus fi is a non-negative function on R, 
and hence a spectral density. 


rN+, 


> C4 


IN- 


[C(w — TV)]"^a;"^dw = C4 


6x0 




6x0 

6x0 
Let $\gamma_1$ be the covariance function corresponding to $f_1$. It is evident that choosing $c_3$ small
enough we can guarantee that $\int|\gamma_1(t)|\,dt \le K$. It remains to check the smoothness condition.
For this purpose we observe that

/ OO 

|w|'^'^^l/o(w)V'(w)|dw 

-00 

nOO 

< 2c3LN-^[BN{fo)r^ / ujP+^fS{uj)\sm{ujXo)\${^{uj - N))duj 
Jo 

/*oo 

< 2c 3LN~^(N + ^)^[BN{fo)]~^ / w^/o(a;)|sin(wxo)|^(^(w-IV))da; < csL, 

Jo 

where the last line follows from the fact that — N)) is supported on [N — 7 r/( 4 xo), N + 

7 r/( 4 a;o)], sin(a;xo) > I/-\/2 on this interval, and from the definition of B^ifo)- This, together 
with 7 o G '^p{c 2 L, I, K), means that 71 G ^p{L,I,K) by choice of constant C3. 


2°. Distance between the estimated values. We have

pCO 

a = l7o(3^o) - 7i(a;o)| = i / [/o(w) -/i(w)]wsin(wxo)da 

Jo 

poo 

= fLN-P[Br,ifo)]-^ f^iu:)uj^sin^iu:xo)${^iiu-N))duj=§^LN-P, ( 53 ) 
Jo 

where the last equality follows from the definition of $B_N(f_0)$.


3°. Spectral densities of the sampled discrete-time process. For a generic function $g$ on $\mathbb{R}$ denote

00 

sH ■= i wG(-7r,7r]. 

j=-oo 


Under and P..yj, the spectral densities of the discrete-time process {X{k5)^k G Z}, are /o 
and /i respectively; see, e.g., Grenander & Rosenblatt (1957). By (50) 

C30 

7i(a;) = 7(c.) + c3LfV-7R^(/o)]-ii ^ /o(!^)^(^). (54) 

j=-oo 

In what follows we require that 

{N + ^^)5<^- (55) 

this condition will be verified by our choice of N. Under this condition, since function '0 is 
supported on [N — 7 r/( 4 a;o), + 7 r/( 4 a;o)] U [—— 7 r/( 4 a;Q), — A^ + 7 r/( 4 a:o)], the sum in (54) 
contains only one non-vanishing term corresponding to j = 0. Thus, 

7i(w) = 7o(w) +C3(5"n77"^[BAr(/o)]"Vo(w/(5)'0(w/(5) = Jo{uj)[l + g{uj)], 


where we have denoted 


ff(w) 


caS-^LN-^ 


f^ico/6){^/S) 

^Af(/o)/o(w) 


sin(a;a;o/(5) $(^(j 


^))+7(^(f+^)) 


( 56 ) 


The next lemma summarizes some useful properties of function g. 
Lemma 7 The following statement holds: 

r \giuj)\^duj < clL^N-^^[BNifo)]-^S. 


( 57 ) 
In addition, if $\{g_j\}$ are the Fourier coefficients of the function $g$ then

CO 

_7^-oo 

where f^{u}) := Sf(^{u}) = E^-oo foi^ + ^)- 

Proof: Since /o(w) > 6~^fo{aj/S), 'iuj G [—7r,7r] we obtain 

f l5(w)pdw 

J — 7T 

< clL^N-^P[BN{fo)]-^ r f^{u;/S){ca/S)^sin\u;x,/S)U{^-f^{^-N))+^^{f+N)) 


do; 


= clL^N-^f^[BN{fo)]-^S f^{u;)uj^sm^u;xo)U{^{u;-N))+${^{u; + N)) 

J —7r/(5 


dw 


< clL^N-^^[BN{h)]-H, 

as claimed. The last inequality follows by definitions of (j) and Bn{-). 

Now we prove the second statement of the lemma. Write for brevity A = C 3 LN~^[BN{fo)]~^ 
and note that g{uj) = go{uj/S), where 

go{uj):=A\^ ^ /o(w + 2 | 2 ) f^(^uj)u}sm{uJXo)[${^{uj - N))+ '${^{uj + N))]; ( 59 ) 

j^-OO 

see (56). By the Cauchy-Schwarz inequality and Parceval’s equality 


i: liiiftf < (i: i: /lat) 


, 1/2 


j = -oo 

,1/2, 


\1I2 / s 1/2 „ , / r'^/^ s 1/2 

IsHPdwj ( |5r'(w)pdwj < C3T/V"'^[BAr(/o)]-i/y / |5o(u;)pdwJ , 

where in the last step we used (57). We proceed with bounding the integral on the right hand 
side. 

It follows from (59) that g'^ioj) = Em=i where 

OO 

Ji{uj)=Ay y] /o(w+2y) 2/o(a;)/^(w)a;sin(wj:o)[^(^(a;-/V))+^(^(a; + iV))], 

j^-CO 

CO 

J 2 {uj) = ^ foi^uj + foi^) sin(a;xo) [cj)[^{uj — N)'^ + + ^))] i 

j^-oo 

OO 

73 ( 0 ;) = ^!^ y] /o(a;+2y) /2(^)^;a;gcos(wa;o)[^(^(a;-/V))+<^(^(a; + Ai))], 

j^-CO 
CO 

J4(a.)=A[ /o(a;+2|2)]- 


J^-OO 


AM=- Eyf)y+2T,7<)p +?(S‘("+«))i I: /;(-+¥)■ 

Since Eil -00 /o(‘^ + 27rj7(5) > /o(w), 


^7r/(5 


-■n f S 


I Ji(u;)7dM < AA^BM) = 4c2l2/v-2/3[b^(/o)]-2b^(/'). 
Similarly we obtain the following bounds



I J2(a;)pdw 


< 



I Jfe(a;)pdw 
I J5(a;)|^dcti 


< 

< 

< 

< 


- N)) +^{^{u; + N))]^du; 

J —7r/(5 

f^{u:)u:^Bin\u:Xo)[^{^{u:-N))+^{^{u: + N))]du: 

J--R/S 

cL2N-2P-2^BN{h)]-\ 
cL^N-^P[BN[h)]-\ fc = 3,4, 


where /o(a;) = YlT=-ca /o(^ + ^)- Combining these results we come to the statement (58). 


4°. The Kullback-Leibler divergence. Now we proceed with bounding the Kullback-Leibler
divergence between the probability measures $P_{\gamma_0}$ and $P_{\gamma_1}$ generated by observation $X^n$ under
hypotheses $H_0$ and $H_1$. Under $H_0$ the distribution of observation $X^n$ is multivariate normal
with zero mean and covariance matrix $\Sigma_0 = T_n(\tilde f_0)$; while under $H_1$ the distribution is the
multivariate normal $P_{\gamma_1}$ with zero mean and covariance matrix $\Sigma_1 = T_n(\tilde f_1) = T_n(\tilde f_0) + T_n(\tilde f_0 g)$.
The Kullback-Leibler divergence between these multivariate normal distributions is


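$$\mathcal{K}(P_{\gamma_0}, P_{\gamma_1}) = \mathrm{E}_{\gamma_1}\log\Big[\frac{dP_{\gamma_1}}{dP_{\gamma_0}}(X^n)\Big] = \frac{1}{2}\log\frac{\det(\Sigma_0)}{\det(\Sigma_1)} - \frac{1}{2}n + \frac{1}{2}\mathrm{tr}(\Sigma_0^{-1}\Sigma_1).$$

Put for brevity $U = T_n(\tilde f_1) - T_n(\tilde f_0) = T_n(\tilde f_0 g)$; then

$$\begin{aligned}
\mathcal{K}(P_{\gamma_0}, P_{\gamma_1}) &= -\tfrac{1}{2}\log\det(\Sigma_0^{-1}\Sigma_1) + \tfrac{1}{2}\mathrm{tr}(\Sigma_0^{-1}\Sigma_1 - I) = -\tfrac{1}{2}\log\det(I + \Sigma_0^{-1}U) + \tfrac{1}{2}\mathrm{tr}(\Sigma_0^{-1}U) \\
&= -\tfrac{1}{2}w(\Sigma_0^{-1}U) + \tfrac{1}{4}\mathrm{tr}\{(\Sigma_0^{-1}U)^2\}, \qquad (60)
\end{aligned}$$

where $w(A) := \log\det(I + A) - \mathrm{tr}(A) + \tfrac{1}{2}\mathrm{tr}\{A^2\}$. Our current goal is to bound the two terms
on the right hand side of (60).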

First we note the following upper and lower bounds on /g. It follows from the definition of 
/g and (f> that 


hM = i E 2 E 


^^ .j+2wj ) ^ ^ y(w/;r) = co. 


On the other hand, 


/o(w) < Og + ciJ-i ^ [c{^-N)+({^+N)] 

j ^-00 

00 

< cio + ciS-^2^{xo-d)/i + ciS-^ Y. [c{^ - Y+ N)] 

i — — co 

< Cig + Cii(5“^ + < Ci3(5“^; 

here the third inequality follows from (48) and (55). Thus we have shown that 

0 < Co < inf 7g(w) < sup fo{uj) < (61) 
We are in a position to bound $\mathcal{K}(P_{\gamma_0}, P_{\gamma_1})$ in (60) from above. The statement (vi) of Lemma 6
together with (57) and (58) implies

tr{(Eo-ip)2} = tr{[r-i(/o)r„(/og)]2} 

This yields the upper bound on the second term on the right hand side of (60). 

Now we proceed with bounding r(;(I]()'^P). Note that 

l|So V|l 2 = \\T-\fo)TMog)h < l|r„-'(/o)ll 2 ||r„(/off)|| 2 . (63) 

In view of the lower bound in (61) and by Lemma 6(ii), |lT'„'^(/o )||2 < Cq Using the definition 
of g [see (56)] we obtain 


fo{uj)g{u;) = c^S-^LN-f^lBNifo)]-^ f^{u;/6){uj/6) sm{ujxo/S) ^(^(f - N)) + + N)) 


< c36-^LN-P[BN{fo)]-^fSiio/S)iu;/S) < c.^S-^LN-^+^iBNifo)] 


-1 


where in the last inequality we took into account that 

supp(g) = [S{N - S{N + 41;^)] U [ - S{N + ^), -S{N - ^f;^)], 
and the definition of /o [see (49)]. Then it follows from (63) and Lemma 6(i) that 

'(/0)7’n(/05)||2 < ds^UiV-^+l [U^ (/q)] ' ^ . 


Let us require that the choice of N be such that 

ci5rUfV-^+i[Bjv(/o)]-' < 1/2. (64) 


Then Lemma 5 is applicable, and 

|u;(So-V)| = |u;(T-i(/o)r„(/og))| < |llT-i(/o)T„(/o5)||^. 

Since ||T“H/o)'rn(/off)|lF = tr{[r“i(^)r„(/o5)]2}, we obtain from (64), (62) and (60) that 

/C(P^,,P^„) < ci6L'fV-2^[i?^(/o)]-i{n(5 + r 2 [U^(/o)]-i/ 2 ([U^(//)]i /2 + [U^(/#)]i/ 2 ) J. ( 55 ) 


Recall that i?Ar(/o) > csiV^; see (52). Moreover, 


BNifo) 


< 


2 



[/o(‘^)]^‘^^da; < ci7 


[C'(w)]^(u; + Nfduj < c,sN^ 


where the last inequality is obtained from the definition of Ci’)- By similar argument one can 
show that i?Ar(/Q) < CigiV^, provided that i is large enough. Combining these inequalities with 
(65) we finally obtain 

/C(P^i,P^o) < C2oL‘^N-^^-^{n6 + 6-^) = 0201 “^N-^^-\T + 6-^). 
Choice of $N$ and proof completion. Pick integer $N_0$ such that for some constant $c_{21}$

$$N = N_* = c_{21}(L^2T)^{1/(2\beta+2)}. \qquad (66)$$


With this choice under condition (22) we have 

+ 477 )^ ^ C2i(L2t)1/(2/3+2)^ + 7 r( 4 a:o)-M < C21C2 + 
which is less than $\pi$ by choice of $c_{21}$ and as $\delta \to 0$. Thus (55) holds for all $\delta$ small.

Moreover, in view of (52) 

C^S-^LN-^+^IBnAIo)]-^ < Ci5C5S-^LN-P-^ = Ci5C5C^/-'5-lr-l/2 < Ci5C5C^/-^Cf\ 

where in the last inequality we have used (22). The left hand side of the last inequality is less
than 1/2 for $C_1$ large enough. Thus (64) is fulfilled.

Finally, if $N = N_*$, then in view of (22), $\mathcal{K}(P_{\gamma_1}, P_{\gamma_0}) \le c_{22}$. Therefore the theorem statement
follows from (53), (44) and (45). The proof is completed. ■
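The way the choice (66) produces the rate can be made concrete with a small computation. The sketch below assumes that the separation in (53) scales as $LN^{-\beta}$ and that the leading term of the Kullback-Leibler bound scales as $L^2N^{-2\beta-2}T$, with all constants set to 1 for illustration; the function name and the numerical inputs are hypothetical.

```python
def two_point_parameters(L, T, beta, c21=1.0):
    """Return (N, separation, kl_scale) for N = c21 * (L^2 T)^(1 / (2 beta + 2))."""
    N = c21 * (L * L * T) ** (1.0 / (2.0 * beta + 2.0))
    separation = L * N ** (-beta)                       # order of a in (53), constants dropped
    kl_scale = L * L * N ** (-2.0 * beta - 2.0) * T     # leading KL term, constants dropped
    return N, separation, kl_scale

L, T, beta = 2.0, 1.0e4, 1.5
N, separation, kl_scale = two_point_parameters(L, T, beta)
rate = L ** (1.0 / (beta + 1.0)) * T ** (-beta / (2.0 * beta + 2.0))
print(N, separation, kl_scale, rate)
```

With this $N$ the Kullback-Leibler scale stays bounded while the separation equals $L^{1/(\beta+1)}T^{-\beta/(2\beta+2)}$, which is the normalization appearing in Theorem 4.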